Amazon.com Recommendations: Item-to-Item Collaborative Filtering. Authors: Greg Linden, Brent Smith, and Jeremy York. Origin: January/February 2003, published by the IEEE Computer Society. Reporter: 朱韋恩. Date: 2008/11/3
Outline: Introduction, Problems, Recommendation Algorithms, Comparison, Conclusion
Recommender systems in our daily lives
Some problems: Many applications require the result set to be returned in real time. New customers typically have extremely limited information. Customer data is volatile.
Three common approaches to solving the problem: traditional collaborative filtering, cluster models, and search-based methods. Amazon.com's approach: the item-to-item CF algorithm.
Traditional Collaborative Filtering: Nearest-neighbor CF algorithm. Each customer is represented as an N-dimensional vector of items, and the similarity of two customers A and B is measured by the cosine of the angle between their vectors.
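The cosine measure on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `cosine_similarity`, the four-item catalog, and the 0/1 purchase encoding are assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length item vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a customer with no purchases is treated as similar to no one
    return dot / (norm_a * norm_b)

# Hypothetical catalog of N = 4 items; 1 = purchased, 0 = not purchased.
customer_a = [1, 1, 0, 0]
customer_b = [1, 0, 1, 0]
similarity = cosine_similarity(customer_a, customer_b)
```

Here the two customers share one of the two items each has bought, so the similarity comes out at about 0.5.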
Traditional Collaborative Filtering. Disadvantages: 1. Examines only a small customer sample... 2. Item-space partitioning... 3. If it discards the most popular or unpopular items...
Cluster Models. Goal: divide the customer base into many segments and assign the user to the segment containing the most similar customers.
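The assignment step can be sketched as follows. This is an illustration under stated assumptions: each segment is summarized by a hypothetical centroid vector produced by some offline clustering run (not shown), and the names `assign_segment` and `segment_centroids` are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def assign_segment(user_vector, segment_centroids):
    """Return the name of the segment whose centroid is most similar to the user."""
    return max(
        segment_centroids,
        key=lambda name: cosine(user_vector, segment_centroids[name]),
    )

# Hypothetical centroids over a 4-item catalog (assumed output of offline clustering).
segment_centroids = {
    "fiction_readers": [0.9, 0.8, 0.1, 0.0],
    "gadget_buyers": [0.1, 0.0, 0.9, 0.7],
}
user = [1, 0, 0, 0]
segment = assign_segment(user, segment_centroids)
```

The online work is one comparison per segment rather than one per customer, which is where the scalability gain comes from.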
Cluster Models. Advantage: because the user is compared with a small number of segments rather than the whole customer base, cluster models have better online scalability and performance; the complex and expensive clustering computation is run offline. Disadvantage: recommendation quality is low.
Search-Based Methods: Given the user's purchased and rated items, the algorithm constructs a search query to find other popular items, for example items by the same author, artist, or director, or with similar keywords.
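A rough sketch of this idea, assuming a hypothetical in-memory catalog in which each item has an `author`, a set of `keywords`, and a `sales` count used as the popularity signal. None of these field names come from the paper, and a real system would query a search index instead.

```python
def search_based_recommendations(purchased_ids, catalog, top_n=3):
    """Recommend popular items sharing an author or keyword with the user's items."""
    # Build the "query": every author and keyword seen in the user's items.
    authors = {catalog[i]["author"] for i in purchased_ids}
    keywords = set()
    for i in purchased_ids:
        keywords |= catalog[i]["keywords"]

    # Keep unpurchased items that match the query on author or keyword.
    matches = [
        (item_id, info)
        for item_id, info in catalog.items()
        if item_id not in purchased_ids
        and (info["author"] in authors or info["keywords"] & keywords)
    ]
    # Most popular first; ties broken by item id for a stable order.
    matches.sort(key=lambda m: (-m[1]["sales"], m[0]))
    return [item_id for item_id, _ in matches[:top_n]]

# Hypothetical catalog data.
catalog = {
    "b1": {"author": "Author A", "keywords": {"space"}, "sales": 50},
    "b2": {"author": "Author A", "keywords": {"cooking"}, "sales": 10},
    "b3": {"author": "Author B", "keywords": {"space"}, "sales": 90},
    "b4": {"author": "Author C", "keywords": {"gardening"}, "sales": 200},
}
results = search_based_recommendations({"b1"}, catalog)
```

The query grows with every purchased item, which hints at why the approach becomes impractical for users with thousands of purchases.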
Search-Based Methods: If the user has few purchases or ratings, search-based recommendation algorithms scale and perform well. For users with thousands of purchases, it is impractical to base a query on all the items.
Search-Based Methods. Disadvantage: recommendations tend to be either 1. too general or 2. too narrow.
Item-to-Item Collaborative Filtering: Rather than matching the user to similar customers, build a similar-items table by finding items that customers tend to purchase together. Amazon.com uses this method.
Amazon.com
Amazon.com
Item-to-Item CF Algorithm
For each item in product catalog, I1
  For each customer C who purchased I1
    For each item I2 purchased by customer C
      Record that a customer purchased I1 and I2
  For each item I2
    Compute the similarity between I1 and I2
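The loop above can be turned into a small runnable sketch. It assumes purchase history fits in a dictionary mapping each customer to a set of items, and it uses cosine similarity between the items' binary customer vectors as the similarity measure, which is one common choice rather than the only one. The function name `build_similar_items` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def build_similar_items(purchases):
    """purchases: dict mapping customer -> set of purchased items.

    Returns a dict mapping each item to a list of (other_item, similarity)
    pairs, most similar first.
    """
    # Invert the data: item -> set of customers who bought it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    similar = {}
    for i1, i1_buyers in buyers.items():
        # Record how many customers purchased both i1 and each other item i2.
        co_counts = defaultdict(int)
        for customer in i1_buyers:
            for i2 in purchases[customer]:
                if i2 != i1:
                    co_counts[i2] += 1
        # Cosine similarity between the binary customer vectors of i1 and i2.
        scores = {
            i2: count / math.sqrt(len(i1_buyers) * len(buyers[i2]))
            for i2, count in co_counts.items()
        }
        similar[i1] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return similar

# Hypothetical purchase history.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "pen"},
    "carol": {"book", "pen"},
    "dave": {"lamp"},
}
table = build_similar_items(purchases)
```

Only item pairs that share at least one customer are ever scored, so the work tracks the number of co-purchases rather than the square of the catalog size.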
Item-to-Item Collaborative Filtering. Advantage: increases scalability and performance.
Scalability: A Comparison. Traditional CF: impractical on large data sets. Cluster models: perform much of the computation offline, but recommendation quality is relatively poor. Search-based models: scale poorly for customers with numerous purchases and ratings.
Scalability: A Comparison. Item-to-Item CF: creates the similar-items table offline; is fast for extremely large data sets; recommendation quality is excellent; performs well with limited user data.
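Once the similar-items table exists, the online step reduces to lookups and a small aggregation, which is why it stays fast and works with only a few purchases. The sketch below assumes a precomputed table in the form item -> list of (similar item, score). Summing scores across the user's items is an illustrative aggregation choice, and the name `recommend` and the sample table are hypothetical.

```python
from collections import defaultdict

def recommend(user_items, similar_items, top_n=3):
    """Aggregate items similar to the user's purchases and return the top few."""
    scores = defaultdict(float)
    for item in user_items:
        for other, score in similar_items.get(item, []):
            if other not in user_items:  # do not recommend what the user already owns
                scores[other] += score
    # Highest aggregate score first; ties broken by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical precomputed similar-items table.
similar_items = {
    "book": [("pen", 0.8), ("lamp", 0.7)],
    "pen": [("book", 0.8), ("notebook", 0.6), ("lamp", 0.4)],
}
recommendations = recommend({"book", "pen"}, similar_items)
```

The cost of a request depends on how many items the user has bought, not on the number of customers or the size of the catalog.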
Conclusion
