Music Recommender Systems

Transcript

  • 1. Music Recommender Systems 超群.com [email_address] http://www.fuchaoqun.com 2009.11.22 Beta Tech Salon. Official Twitter: @betasalon. Official website: http://club.blogbeta.com. Mailing list: http://groups.google.com/group/betasalon?hl=zh-CN
  • 2. Music Recommender Systems 超群.com [email_address] http://www.fuchaoqun.com
  • 3. Who is Using Recommender Systems?
  • 4. Recommender Systems
    • Summary :
      • http://en.wikipedia.org/wiki/Recommender_system
    • Keywords :
    • recommender systems, association rules, collaborative filtering, Slope One, SVD, KNN, …
  • 5. Algorithms
    • Association Rules
    • Slope One
    • SVD
    • …
  • 6. Algorithms
    • Association Rules
    • Slope One
    • SVD
    • …
  • 7. Association Rules
    Transactions:
      TID  Items
      1    Bread, Milk
      2    Bread, Diaper, Beer, Egg
      3    Diaper, Beer, Cola
      4    Bread, Milk, Diaper, Beer
      5    Bread, Milk, Diaper, Cola
    Item pairs and the number of transactions containing both:
      Items          Times
      Beer, Diaper   3
      Bread, Milk    3
      Beer, Bread    2
      Diaper, Milk   2
      Beer, Milk     1
  • 8. Association Rules
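    • Support: the fraction of all transactions that contain every item in the rule, support(X ⇒ Y) = count(X ∪ Y) / N.
    • Confidence: how often the consequent appears in transactions that contain the antecedent, confidence(X ⇒ Y) = support(X ∪ Y) / support(X).
    • Algorithms: Apriori algorithm, FP-growth algorithm
    • http://en.wikipedia.org/wiki/Association_rule_learning
    • Demo: Python + Orange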
    • http://www.fuchaoqun.com/2008/08/data-mining-with-python-orange-association_rule/
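The support and confidence measures can be checked by hand against the five transactions on the Association Rules example slide. The sketch below is only an illustration in plain Python, not the Orange demo linked above; the helper names `support` and `confidence` are made up for this example.

```python
# Transactions copied from the Association Rules example slide.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Egg"},
    {"Diaper", "Beer", "Cola"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


# Rule "Diaper => Beer": 3 of 5 transactions hold both items,
# and 4 of 5 transactions hold Diaper.
diaper_beer_support = support({"Diaper", "Beer"}, transactions)
diaper_beer_confidence = confidence({"Diaper"}, {"Beer"}, transactions)
print(diaper_beer_support, diaper_beer_confidence)
```

Apriori and FP-growth exist to avoid this brute-force counting: they prune itemsets whose support is already below a threshold before counting larger ones.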
  • 9. Algorithms
    • Association Rules
    • Slope One
    • SVD
    • …
  • 10. Slope One
      User   That is it   Straight Through My Heart
      Jim    4            5
      Mike   2            4
      Fred   3            ?
  • 11. Slope One
    • By Daniel Lemire in 2005
      • http://www.daniel-lemire.com/fr/abstracts/SDM2005.html
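    • Simpler Could Be Better
    • Weighted average: P(u)_j = Σ_i (dev_j,i + u_i) * c_j,i / Σ_i c_j,i, where dev_j,i is the average difference between the ratings of items j and i, and c_j,i is the number of users who rated both.
    • http://en.wikipedia.org/wiki/Slope_One
    • Implementations: http://taste.sourceforge.net/ (Java), http://code.google.com/p/openslopeone (PHP & MySQL)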
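A minimal sketch of weighted Slope One, applied to the three users on the Slope One example slide, shows how Fred's missing rating would be filled in. The function names are illustrative assumptions and are not taken from Taste or openslopeone.

```python
from collections import defaultdict

# Ratings from the Slope One example slide; Fred has not rated the second song.
ratings = {
    "Jim": {"That is it": 4, "Straight Through My Heart": 5},
    "Mike": {"That is it": 2, "Straight Through My Heart": 4},
    "Fred": {"That is it": 3},
}


def build_deviations(ratings):
    """Return (dev, count), where dev[j][i] is the average of (r_j - r_i)
    over users who rated both items and count[j][i] is how many did."""
    total = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    total[j][i] += r_j - r_i
                    count[j][i] += 1
    dev = {j: {i: total[j][i] / count[j][i] for i in total[j]} for j in total}
    return dev, count


def predict(user, target, ratings, dev, count):
    """Weighted Slope One prediction of `user`'s rating for `target`."""
    numerator = 0.0
    denominator = 0
    for i, r_i in ratings[user].items():
        n = count.get(target, {}).get(i, 0)
        if n:
            numerator += (dev[target][i] + r_i) * n
            denominator += n
    return numerator / denominator if denominator else None


dev, count = build_deviations(ratings)
fred_prediction = predict("Fred", "Straight Through My Heart", ratings, dev, count)
print(fred_prediction)
```

Jim and Mike rate the second song 1 and 2 points higher than the first, so the average deviation is 1.5 and Fred's predicted rating is 3 + 1.5 = 4.5. With only two items the weighting has no effect; it matters once several rated items contribute deviations backed by different numbers of users.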
  • 12. Algorithms
    • Association Rules
    • Slope One
    • SVD
    • …
  • 13. Similarity
    • Similarity: [formula not preserved in the transcript]
  • 14. SVD [image copied from a linked source]
  • 15. SVD in Image Compression [figure: original image next to reconstructions with K=10 and K=20]
  • 16. Process SVD
    • Define the original user-item matrix R, of size m x n, which holds the ratings of m users on n items. r_ij is the rating of user u_i on item i_j.
    • Preprocess the user-item matrix R to eliminate all missing values.
    • Compute the SVD of R to obtain matrices U, S and V, of size m x m, m x n and n x n, respectively. Their relationship is R = U * S * V^T.
    • Reduce the dimensionality by keeping only the k largest diagonal entries of S, which gives a k x k matrix S_k. Correspondingly, keep the first k columns of U and V to obtain U_k (m x k) and V_k (n x k), so that V_k^T is k x n. The "reduced" user-item matrix is R' = U_k * S_k * V_k^T, and r'_ij denotes the rating of user u_i on item i_j in this reduced matrix.
    • Compute sqrt(S_k), then form two matrix products: U_k * sqrt(S_k)^T, which represents the m users, and sqrt(S_k) * V_k^T, which represents the n items, in the k-dimensional feature space. The latter matrix, of size k x n, is of particular interest.
    • Run KNN on the user matrix and the item matrix, or multiply the two to get each user's predicted rating for every item.
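The steps above can be sketched with NumPy, one of the libraries the deck lists for SVD. The rating matrix is made up for illustration. Filling missing ratings with the item mean is only one possible way to do the preprocessing step, which the slide does not specify. The code also uses the thin SVD (`full_matrices=False`), so U and V^T come back smaller than the m x m and n x n sizes quoted above, though the rank-k result is the same.

```python
import numpy as np

# Hypothetical 4-user x 5-item rating matrix; 0 marks a missing rating.
R = np.array([
    [5, 3, 0, 1, 4],
    [4, 0, 0, 1, 5],
    [1, 1, 0, 5, 2],
    [0, 1, 5, 4, 1],
], dtype=float)

# Preprocessing (an assumed strategy): replace each missing value
# with the mean of the observed ratings for that item.
observed = R > 0
item_means = R.sum(axis=0) / np.maximum(observed.sum(axis=0), 1)
R_filled = np.where(observed, R, item_means)

# SVD: R_filled = U @ diag(s) @ Vt, singular values sorted in descending order.
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)

# Dimensionality reduction: keep the k largest singular values.
k = 2
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]
R_reduced = U_k @ S_k @ Vt_k           # rank-k approximation R'

# User and item representations in the k-dimensional feature space.
sqrt_S_k = np.sqrt(S_k)                # S_k is diagonal, so sqrt is element-wise
user_features = U_k @ sqrt_S_k         # shape (m, k), one row per user
item_features = sqrt_S_k @ Vt_k        # shape (k, n), one column per item

# Multiplying the two factor matrices gives every user's predicted rating.
predicted = user_features @ item_features
print(np.round(predicted, 2))
```

A nearest-neighbour search over the rows of `user_features` or the columns of `item_features` would then give similar users or similar items.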
  • 17. Demo (from a linked example): Which two people have the most similar tastes? Which two seasons are the closest?
  • 18. Demo
  • 19. Demo
  • 20. SVD
    • SVD
      • matlab
      • LAPACK, BLAS (Fortran)
      • numpy, scipy (Python)
      • SVDLIBC, Meschach (C)
      • http://en.wikipedia.org/wiki/Singular_value_decomposition
      • ……
    • KNN:
      • matlab
      • FLANN
      • ……
    • All in one solution :
      • DIVISI
      • ……
  • 21. MAGIC DIVISI!
        #!/usr/bin/env python
        #coding=utf-8
        import divisi
        from divisi.cnet import *

        data = divisi.SparseLabeledTensor(ndim=2)

        # read some ratings into data
        # data[user_id, song_id] = 4

        svd_result = data.svd(k=128)

        # get songs that the user may like
        # predict_features(svd_result, user_id).top_items(100)

        # get similar songs
        # feature_similarity(svd_result, song_id).top_items(100)

        # get users that have similar tastes
        # concept_similarity(svd_result, user_id).top_items(100)
  • 22. Music Recommender Systems
    • Data collection
    • Data Cleaning
    • Data Preprocessing
    • Data Mining
    • Tracking & Optimization
  • 23. Data collection
    • User rating
    • User collection
    • User listen log
    • User view log
    • …
  • 24. Data Cleaning
    • Missing data
    • Wrong data
    • Noisy data
    • Duplicate data
    • …
      UserId  SongId  Times
      3306    3654    200
      3306    6950    236
      3306    6528    268
      3306    5874
      3306    9527    foo
      3306    5624    1000000
      3306    9635    5
      3306    6950    236
      …       …       …
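As a rough illustration of the four problem types listed on this slide, the sketch below filters the sample rows in plain Python. The cutoff `MAX_PLAUSIBLE_TIMES` is a hypothetical threshold chosen for this example. The slide does not say how noise is detected, nor whether the very low count of 5 should also be treated as noise, so that row is kept here.

```python
# Raw (user_id, song_id, times) rows from the Data Cleaning slide.
# None marks the row whose play count is missing.
raw_rows = [
    ("3306", "3654", "200"),
    ("3306", "6950", "236"),
    ("3306", "6528", "268"),
    ("3306", "5874", None),
    ("3306", "9527", "foo"),
    ("3306", "5624", "1000000"),
    ("3306", "9635", "5"),
    ("3306", "6950", "236"),
]

MAX_PLAUSIBLE_TIMES = 10_000  # hypothetical cutoff for implausible play counts


def clean(rows, max_times=MAX_PLAUSIBLE_TIMES):
    """Drop rows with missing, non-numeric, implausible, or duplicate data."""
    seen = set()
    cleaned = []
    for user_id, song_id, times in rows:
        if times is None:                 # missing data
            continue
        if not times.isdigit():           # wrong data, e.g. "foo"
            continue
        count = int(times)
        if count > max_times:             # noisy data (outlier)
            continue
        if (user_id, song_id) in seen:    # duplicate data
            continue
        seen.add((user_id, song_id))
        cleaned.append((user_id, song_id, count))
    return cleaned


cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```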
  • 25. Data Preprocessing
    Play counts:
      UserId  SongId  Times
      3306    3654    200
      3306    6950    236
      3306    6528    268
      3306    5874    325
      3306    9527    126
      3306    5624    98
      3306    9635    115
      3306    6962    210
      …       …       …
    Normalized weights:
      UserId  SongId  Weight
      3306    3654    0.62
      3306    6950    0.73
      3306    6528    0.82
      3306    5874    1
      3306    9527    0.39
      3306    5624    0.30
      3306    9635    0.35
      3306    6962    0.65
      …       …       …
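The weights on this slide are consistent with dividing each play count by the user's largest play count (325 for user 3306). That rule is inferred from the numbers and is not stated in the deck. A short sketch under that assumption, with a hypothetical helper `to_weights`:

```python
# Play counts for user 3306, copied from the Data Preprocessing slide.
play_counts = {
    "3654": 200, "6950": 236, "6528": 268, "5874": 325,
    "9527": 126, "5624": 98, "9635": 115, "6962": 210,
}


def to_weights(counts):
    """Scale one user's play counts into (0, 1] by dividing by the user's maximum."""
    top = max(counts.values())
    return {song_id: round(times / top, 2) for song_id, times in counts.items()}


weights = to_weights(play_counts)
for song_id, weight in weights.items():
    print(song_id, weight)
```

Normalizing per user keeps a heavy listener's raw counts from dominating the similarity computation that follows.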
  • 26. Data Mining
    Input:
      UserId  SongId  Weight
      3306    3654    0.62
      3306    6950    0.73
      3306    6528    0.82
      3306    5874    1
      3306    9527    0.39
      3306    5624    0.30
      3306    9635    0.35
      3306    6962    0.65
      …       …       …
    Output:
      UserId  Similar Users' IDs
      …       …
      SongId  Similar Songs' IDs
      …       …
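One simple way to go from the weight table to a list of similar users is a nearest-neighbour search with cosine similarity. This is a sketch of that idea only and is not necessarily the method used in the talk. Users "a" and "b" and their weights are hypothetical, and only three of user 3306's songs are included.

```python
from math import sqrt

# User 3306's weights come from the slide; users "a" and "b" are hypothetical.
weights = {
    "3306": {"3654": 0.62, "6950": 0.73, "6528": 0.82},
    "a": {"3654": 0.60, "6950": 0.70, "6528": 0.80},
    "b": {"3654": 0.10, "5874": 1.00},
}


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[s] * v[s] for s in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_users(user_id, weights, top_n=1):
    """Return the ids of the `top_n` users closest to `user_id`."""
    scores = [
        (cosine(weights[user_id], other_weights), other_id)
        for other_id, other_weights in weights.items()
        if other_id != user_id
    ]
    return [other_id for _, other_id in sorted(scores, reverse=True)[:top_n]]


nearest = most_similar_users("3306", weights)
print(nearest)
```

Swapping the roles of users and songs (one vector per song, indexed by user) gives the similar-songs table in the same way.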
  • 27. Tracking & Optimization
    • Recommended result
    • Users view and click what they like
    • Store users' clicks
    • Data Mining
    • Better recommendation
  • 28. That's it, Thanks. Q&A