Amazon Item-to-Item Recommendations


  1. 1. Amazon.com Recommendations: Item-to-Item Collaborative Filtering Authors: Greg Linden, Brent Smith, and Jeremy York Origin: IEEE Internet Computing, January/February 2003, published by the IEEE Computer Society Presenter: 朱韋恩 Date: 2008/11/3
  2. 2. Outline <ul><li>Introduction </li></ul><ul><li>Problems </li></ul><ul><li>Recommendation Algorithms </li></ul><ul><li>Comparison </li></ul><ul><li>Conclusion </li></ul>
  3. 3. Recommender systems in everyday life
  4. 4. Some problems <ul><li>Many applications require the result set to be returned in real time </li></ul><ul><li>New customers typically have extremely limited information </li></ul><ul><li>Customer data is volatile </li></ul>
  5. 5. Three common approaches to solving the problem <ul><li>Traditional collaborative filtering </li></ul><ul><li>Cluster models </li></ul><ul><li>Search-based methods </li></ul><ul><li>Amazon.com </li></ul><ul><li>Item-to-Item CF Algorithm </li></ul>
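  6. 6. Traditional Collaborative Filtering <ul><li>Nearest-neighbor CF algorithm </li></ul><ul><li>Cosine similarity </li></ul><ul><ul><li>Represent each customer as an N-dimensional vector of items, where N is the number of distinct catalog items, and measure the similarity of two customers A and B as the cosine of the angle between their vectors </li></ul></ul>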
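The cosine measure between two customers can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the four-item catalog, the binary purchase vectors, and the names `cosine_similarity`, `customer_a`, and `customer_b` are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length purchase vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a customer with no purchases is treated as similar to nobody.
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical catalog of N = 4 items; 1 means the customer bought the item.
customer_a = [1, 1, 0, 0]
customer_b = [1, 0, 1, 0]
similarity = cosine_similarity(customer_a, customer_b)
```

The two customers share one of their two purchases each, so the similarity comes out at about 0.5. Computing this against every other customer is what makes the traditional approach expensive online.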
  7. 7. Traditional Collaborative Filtering <ul><li>Disadvantages of shrinking the data to cut computation cost </li></ul><ul><li>1. Examines only a small customer sample... </li></ul><ul><li>2. Item-space partitioning... </li></ul><ul><li>3. Discards the most popular or unpopular items... </li></ul>
  8. 8. Cluster Models <ul><li>Goal: </li></ul><ul><li>Divide the customer base into many segments and assign the user to the segment containing the most similar customers </li></ul>
  9. 9. Cluster Models <ul><li>Advantage </li></ul><ul><li>Better online scalability and performance, because the user is compared against a small number of segments rather than the whole customer base </li></ul><ul><li>Disadvantage </li></ul><ul><li>The complex and expensive clustering computation can run offline, but recommendation quality is low </li></ul>
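The online step of a cluster model, assigning a user to the most similar segment, might look like the sketch below. The slides do not say how a segment is represented, so summarizing each segment by a centroid vector is an assumption here, as are the segment names and the helper names `cosine` and `assign_segment`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def assign_segment(user, centroids):
    """Return the name of the segment whose centroid is most similar to the user."""
    return max(centroids, key=lambda name: cosine(user, centroids[name]))

# Hypothetical segments, each summarized by a centroid over a 4-item catalog.
# The centroids would come from the offline clustering step.
centroids = {
    "sci-fi readers": [0.9, 0.8, 0.1, 0.0],
    "cooking fans": [0.0, 0.1, 0.9, 0.7],
}
user = [1, 0, 0, 0]
segment = assign_segment(user, centroids)
```

Only one comparison per segment is needed online, which is where the scalability gain comes from. Recommendations are then drawn from the purchases of the chosen segment, which is also why they are less tailored to the individual user.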
  10. 10. Search-Based Methods <ul><li>Given the user's purchased and rated items, construct a search query to find other popular items </li></ul><ul><li>For example, items by the same author, artist, or director, or with similar keywords </li></ul>
  11. 11. Search-Based Methods <ul><li>If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well </li></ul><ul><li>For users with thousands of purchases, it is impractical to base a query on all the items </li></ul>
  12. 12. Search-Based Methods <ul><li>Disadvantage: recommendations tend to be either </li></ul><ul><li>1. too general, or </li></ul><ul><li>2. too narrow </li></ul>
  13. 13. Item-to-Item Collaborative Filtering <ul><li>Rather than matching the user to similar customers, build a similar-items table by finding items that customers tend to purchase together </li></ul><ul><li>Amazon.com uses this method </li></ul>
  14. 14. Amazon.com
  15. 15. Amazon.com
  16. 16. Item-to-Item CF Algorithm <ul><li>For each item in product catalog, I1 </li></ul><ul><ul><li>For each customer C who purchased I1 </li></ul></ul><ul><ul><ul><li>For each item I2 purchased by customer C </li></ul></ul></ul><ul><ul><ul><ul><li>Record that a customer purchased I1 and I2 </li></ul></ul></ul></ul><ul><ul><li>For each item I2 </li></ul></ul><ul><ul><ul><li>Compute the similarity between I1 and I2 </li></ul></ul></ul>
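The loop on this slide can be sketched in Python as follows. The slide does not fix the similarity measure, so this example assumes cosine similarity over binary purchase vectors (co-purchase count divided by the square root of the product of the two items' buyer counts). The data and the names `build_similar_items`, `purchases`, and `similar_items` are made up for illustration.

```python
import math
from collections import defaultdict

def build_similar_items(purchases):
    """purchases maps customer id -> set of item ids.

    Returns a table mapping each item to {other item: similarity}.
    """
    buyers = defaultdict(set)  # item -> customers who bought it
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    table = {}
    for i1, i1_buyers in buyers.items():          # for each item I1
        co_counts = defaultdict(int)
        for customer in i1_buyers:                # for each customer C who bought I1
            for i2 in purchases[customer]:        # for each item I2 bought by C
                if i2 != i1:
                    co_counts[i2] += 1            # record that C bought I1 and I2
        # For each co-purchased item I2, compute the similarity between I1 and I2.
        table[i1] = {
            i2: count / math.sqrt(len(i1_buyers) * len(buyers[i2]))
            for i2, count in co_counts.items()
        }
    return table

# Hypothetical purchase history.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "pen"},
    "carol": {"book", "pen"},
    "dave": {"lamp"},
}
similar_items = build_similar_items(purchases)
```

Items that were never bought together do not appear in each other's rows, so the table stays sparse. The whole computation can run offline, which is the point the following slides make about scalability.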
  17. 17. Item-to-Item Collaborative Filtering <ul><li>Advantage </li></ul><ul><li>Increases scalability and performance </li></ul>
  18. 18. Scalability: A Comparison <ul><li>Traditional CF: </li></ul><ul><li>Impractical on large data sets </li></ul><ul><li>Cluster models: </li></ul><ul><li>Perform much of the computation offline, but recommendation quality is relatively poor </li></ul><ul><li>Search-based models: </li></ul><ul><li>Scale poorly for customers with numerous purchases and ratings </li></ul>
  19. 19. Scalability: A Comparison <ul><li>Item-to-Item CF: </li></ul><ul><li>Creates the similar-items table offline </li></ul><ul><li>Fast even for extremely large data sets </li></ul><ul><li>Recommendation quality is excellent </li></ul><ul><li>Performs well with limited user data </li></ul>
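Once the similar-items table exists, the online step reduces to a few lookups. The sketch below is one plausible way to do it, not Amazon's implementation: summing similarity scores across the user's items is an assumed aggregation rule, and the table values and the function name `recommend` are hypothetical.

```python
from collections import defaultdict

def recommend(user_items, similar_items, top_n=3):
    """Rank items similar to what the user already owns, skipping owned items."""
    scores = defaultdict(float)
    for item in user_items:
        for other, sim in similar_items.get(item, {}).items():
            if other not in user_items:
                scores[other] += sim  # assumption: aggregate by summing similarities
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical precomputed similar-items table.
similar_items = {
    "book": {"lamp": 0.67, "pen": 0.82},
    "lamp": {"book": 0.67, "pen": 0.41},
    "pen": {"book": 0.82, "lamp": 0.41},
}
recommendations = recommend({"lamp"}, similar_items)
```

The cost depends only on how many items the user has bought or rated, not on the size of the catalog or the customer base. It also works for a user with a single purchase, as in the usage line above.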
  20. 20. Conclusion
