Recommendation Engines

Modern web applications embrace personalization in order to provide a unique customer experience. Recommendation engines, in general, and Collaborative Filtering, in particular, are essential techniques for delivering state-of-the-art personalization effects on a web site.

These slides are based on a presentation that I gave to New England's Java User Group (NEJUG) in 2009; in that respect, they are quite old. Nevertheless, the content is about the fundamental concepts of these techniques and the fundamentals never go out of fashion.

The code references are from the Yooreeka project, which started with the code of the book "Algorithms of the Intelligent Web" (Manning, 2009). You can find the Yooreeka 2.0 API (Javadoc) at http://www.marmanis.com/static/javadoc/index.html

Published in: Technology

Transcript of "Recommendation Engines"

  1. 1. Recommendation Engines: A key personalization feature of modern web applications. Haralambos (Babis) Marmanis, NEJUG, June 11, 2009
  2. 2. Introduction | Basic Concepts | Collaborative Filtering | Content based | Netflix Prize | Summary. Presentation Outline: 1 Introduction (Recommendations in Action; “It’s the Economy ...”; Java source code); 2 Basic Concepts (The Online Music Store Example; Similarity; Distance (formulas); Similarity (formulas); The “best” Similarity formula); 3 Collaborative Filtering (User based; Rating Counting Matrix; Item based); 4 Content based
  3. 3. Recommendations in Action: Online store recommendations. Amazon.com: Provide recommendations for purchasing more items
  4. 4. Recommendations in Action: Online store recommendations. Netflix.com: Provide recommendations for viewing more movies
  5. 5. Recommendations in Action: Content recommendations. Any news portal or other content aggregator: Recommendations for articles, books, news stories
  6. 6. “It’s the Economy ...”: The Long Tail. Goodbye Pareto Principle, Hello Long Tail. Erik Brynjolfsson, Yu (Jeffrey) Hu, and Michael D. Smith used a log-linear curve to describe the relationship between Amazon.com sales and sales ranking. They found that a large proportion of Amazon.com’s book sales come from obscure books that were not available in brick-and-mortar stores. They also found that the consumer benefit from access to increased product variety in online bookstores is ten times larger than the benefit from access to lower prices online!
  9. 9. Java source code: Yooreeka! Open Source, Machine Learning library. Search, recommendations, clustering, classification, and combination of classifiers! URL: http://code.google.com/p/yooreeka/
  11. 11. The Online Music Store Example: Frank’s music ratings
  12. 12. The Online Music Store Example: Constantine’s music ratings
  13. 13. The Online Music Store Example: Catherine’s music ratings
  14. 14. Similarity: The notion of Similarity. Often based on the notion of distance: the smaller the distance, the greater the similarity. Similarity values are typically constrained to [0,∞) or [0,1]. It is not necessary to define similarity formulas; e.g., if the distance d is smaller than some chosen threshold then similar, otherwise not. Similarity could also be empirical or probabilistic.
  19. 19. Distance (formulas): Let $X$ and $Y$ be two vectors in $\mathbb{R}^N$ with components $X_i$ and $Y_i$. Minkowski or p-norm distance: $d = \left( \sum_{i=1}^{N} |X_i - Y_i|^p \right)^{1/p}$ (1). Manhattan distance ($p = 1$): $d = \sum_{i=1}^{N} |X_i - Y_i|$ (2). Chebyshev or $L_\infty$ distance: $d = \lim_{p \to \infty} \left( \sum_{i=1}^{N} |X_i - Y_i|^p \right)^{1/p} = \max_i |X_i - Y_i|$ (3).
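
The distance formulas on this slide can be sketched in a few lines of Java. This is only an illustration: the class name `Distances` and its method names are hypothetical and are not part of Yooreeka. The Manhattan distance is taken as the p = 1 case of the Minkowski distance, and the Chebyshev distance as the largest coordinate difference.

```java
/** Illustrative distance functions; names are hypothetical, not Yooreeka's API. */
final class Distances {
    private Distances() {}

    /** Minkowski (p-norm) distance; p is assumed to be at least 1. */
    static double minkowski(double[] x, double[] y, double p) {
        checkSameLength(x, y);
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.pow(Math.abs(x[i] - y[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    /** Manhattan distance: the sum of absolute coordinate differences (p = 1). */
    static double manhattan(double[] x, double[] y) {
        checkSameLength(x, y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.abs(x[i] - y[i]);
        }
        return sum;
    }

    /** Euclidean distance: the p = 2 case. */
    static double euclidean(double[] x, double[] y) {
        return minkowski(x, y, 2.0);
    }

    /** Chebyshev distance: the largest absolute coordinate difference. */
    static double chebyshev(double[] x, double[] y) {
        checkSameLength(x, y);
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    private static void checkSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {4, 0, 3};
        System.out.println("Manhattan: " + manhattan(x, y));
        System.out.println("Euclidean: " + euclidean(x, y));
        System.out.println("Chebyshev: " + chebyshev(x, y));
    }
}
```

For large p the Minkowski value approaches the Chebyshev value, which is one way to check the limit numerically.
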
  23. 23. Similarity (formulas): Naïve Similarity: $sim_{Naive} = \frac{\beta}{\beta + d}$ (4), where $d$ is the Euclidean distance. Similarity I: $sim_{I} = 1 - \tanh(\sigma)$ (5), where $\sigma$ is the biased estimator of sample variance. Similarity II: $sim_{II} = sim_{I} \times \frac{common}{maximum}$ (6). There is more ... Jaccard, Tanimoto, and so on.
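
A minimal Java sketch of formulas (4) to (6) follows, under several assumptions that the slide does not spell out: both arrays hold the two users' ratings for the items they have in common, β is a positive constant, σ is read as the mean squared rating difference (dividing by n, hence "biased"), "common" is the number of shared items, and "maximum" is the largest possible number of shared items. The class `RatingSimilarity` is hypothetical and is not Yooreeka's implementation.

```java
/** Illustrative similarity formulas; names and the reading of sigma are assumptions. */
final class RatingSimilarity {
    private RatingSimilarity() {}

    /** Formula (4): beta / (beta + d), with d the Euclidean distance. */
    static double naive(double[] a, double[] b, double beta) {
        requireCommonItems(a, b);
        if (beta <= 0.0) {
            throw new IllegalArgumentException("beta must be positive");
        }
        return beta / (beta + Math.sqrt(sumOfSquaredDifferences(a, b)));
    }

    /** Formula (5): 1 - tanh(sigma), sigma assumed to be the mean squared difference. */
    static double similarityI(double[] a, double[] b) {
        requireCommonItems(a, b);
        double sigma = sumOfSquaredDifferences(a, b) / a.length;
        return 1.0 - Math.tanh(sigma);
    }

    /** Formula (6): similarity I scaled by the fraction of items rated in common. */
    static double similarityII(double[] a, double[] b, int maxCommon) {
        if (maxCommon < a.length) {
            throw new IllegalArgumentException("maxCommon must be >= number of common items");
        }
        return similarityI(a, b) * ((double) a.length / maxCommon);
    }

    private static double sumOfSquaredDifferences(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    private static void requireCommonItems(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("need equal-length, non-empty rating arrays");
        }
    }
}
```

With identical ratings all three formulas return their maximum, and formula (6) then depends only on how many items the two users share.
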
  26. 26. The “best” Similarity formula: Which is the best similarity formula? There is no such thing! It depends on the problem, the data, the definition of ... “best”. Spertus, Sahami, and Büyükkökten (2005), Evaluating similarity measures: a large-scale study in the orkut social network. Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. The simple L2 based (cosine) similarity showed the best empirical results among six similarity metrics.
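
For reference, the L2-normalized (cosine) similarity mentioned on this slide can be sketched as follows. The class name `CosineSimilarity` is hypothetical, and returning 0 for an all-zero vector is a convention chosen here, not something the slide prescribes.

```java
/** Illustrative cosine similarity between two equal-length vectors. */
final class CosineSimilarity {
    private CosineSimilarity() {}

    static double cosine(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        if (normX == 0.0 || normY == 0.0) {
            return 0.0; // convention for this sketch: no direction, no similarity
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }
}
```

Because the result depends only on the angle between the vectors, a user who rates everything twice as high as another user still gets a cosine of 1 with that user.
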
  31. 31. Tapestry: Experimental mail system by Goldberg et al. (circa 1992) at Xerox PARC
  36. 36. User based: User Similarity Matrix
               U1   U2   U3   U4   U5   ..
        U1  [  S11  S12  S13  S14  S15  ... ]
        U2  [  S21  S22  S23  S24  S25  ... ]
        U3  [  S31  S32  S33  S34  S35  ... ]
        U4  [  S41  S42  S43  S44  S45  ... ]
        U5  [  S51  S52  S53  S54  S55  ... ]
        ..  [  ...  ...  ...  ...  ...  ... ]
  37. 37. User based: User Similarity Matrix (cont.)
               U1     U2     U3     U4     U5     ..
        U1  [  1.0,   0.333, 0.385, 0.333, 0.364, ... ]
        U2  [  0.0,   1.000, 0.545, 0.385, 0.615, ... ]
        U3  [  0.0,   0.000, 1.000, 0.364, 0.636, ... ]
        U4  [  0.0,   0.000, 0.000, 1.000, 0.231, ... ]
        U5  [  0.0,   0.000, 0.000, 0.000, 1.000, ... ]
        ..  [  0.0,   0.000, 0.000, 0.000, 0.000, ... ]
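
The numeric matrix on this slide is filled only on and above the diagonal, which is enough because similarity is symmetric. A sketch of building such a matrix is shown below; the class `UserSimilarityMatrix`, the `ratings[user][item]` layout, and the pluggable similarity function are assumptions made for illustration and do not mirror Yooreeka's classes.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative upper-triangular user similarity matrix. */
final class UserSimilarityMatrix {
    private UserSimilarityMatrix() {}

    /**
     * Computes s[u][v] for v > u only and sets the diagonal to 1.0;
     * entries below the diagonal stay 0.0, as on the slide.
     */
    static double[][] build(double[][] ratings,
                            ToDoubleBiFunction<double[], double[]> similarity) {
        int n = ratings.length;
        double[][] s = new double[n][n];
        for (int u = 0; u < n; u++) {
            s[u][u] = 1.0;
            for (int v = u + 1; v < n; v++) {
                s[u][v] = similarity.applyAsDouble(ratings[u], ratings[v]);
            }
        }
        return s;
    }

    /** Symmetric lookup that reads from the stored upper triangle. */
    static double get(double[][] s, int u, int v) {
        return u <= v ? s[u][v] : s[v][u];
    }
}
```

Storing only the upper triangle halves the number of similarity computations, which matters because the matrix grows quadratically with the number of users.
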
  38. 38. Rating Counting Matrix: Rating Counting Matrix
               R1   R2   R3   R4   R5
        R1  [  X11  X12  X13  X14  X15  ]
        R2  [  X21  X22  X23  X24  X25  ]
        R3  [  X31  X32  X33  X34  X35  ]
        R4  [  X41  X42  X43  X44  X45  ]
        R5  [  X51  X52  X53  X54  X55  ]
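
One plausible reading of this matrix, assumed here rather than taken from the slide, is that X_ij counts the items that one user rated i and another user rated j, so that the diagonal holds the items on which the two users agree. The sketch below follows that reading; the class `RatingCounts` is hypothetical and is not Yooreeka's `RatingCountMatrix`.

```java
import java.util.Map;

/** Illustrative rating count matrix for a pair of users. */
final class RatingCounts {
    private RatingCounts() {}

    /**
     * counts[i][j] is the number of common items rated (i + 1) by user A
     * and (j + 1) by user B. Ratings are assumed to lie in 1..maxRating.
     */
    static int[][] count(Map<String, Integer> ratingsA,
                         Map<String, Integer> ratingsB,
                         int maxRating) {
        int[][] counts = new int[maxRating][maxRating];
        for (Map.Entry<String, Integer> entry : ratingsA.entrySet()) {
            Integer ratingB = ratingsB.get(entry.getKey());
            if (ratingB != null) { // only items rated by both users are counted
                counts[entry.getValue() - 1][ratingB - 1]++;
            }
        }
        return counts;
    }

    /** Fraction of common items with identical ratings; 0.0 if nothing is shared. */
    static double agreement(int[][] counts) {
        int total = 0;
        int agreed = 0;
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < counts[i].length; j++) {
                total += counts[i][j];
                if (i == j) {
                    agreed += counts[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) agreed / total;
    }
}
```

Under this reading, the agreement fraction is a simple similarity score between 0 and 1 that needs no distance formula at all.
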
  39. 39. Rating Counting Matrix: BeanShell script (Users)
        // Create the sample music dataset and a user-based recommender
        BaseDataset ds = MusicData.createDataset();
        Delphi delphi = new Delphi(ds, RecommendationType.USER_BASED);
        // Pick a user, find similar users, then ask for recommendations
        MusicUser mu1 = ds.pickUser("Bob");
        delphi.findSimilarUsers(mu1);
        delphi.recommend(mu1);
  40. 40. Item based: Item Similarity Matrix
               I1     I2     I3     I4     I5     ...
        I1  [  1.0,   0.333, 0.385, 0.333, 0.364, ... ]
        I2  [  0.0,   1.000, 0.545, 0.385, 0.615, ... ]
        I3  [  0.0,   0.000, 1.000, 0.364, 0.636, ... ]
        I4  [  0.0,   0.000, 0.000, 1.000, 0.231, ... ]
        I5  [  0.0,   0.000, 0.000, 0.000, 1.000, ... ]
        ..  [  0.0,   0.000, 0.000, 0.000, 0.000, ... ]
  41. 41. Item based: BeanShell script (Items)
        // ds is the dataset created in the user-based script (slide 39)
        Delphi delphi = new Delphi(ds, RecommendationType.ITEM_BASED);
        // Ask for recommendations for a user
        MusicUser mu1 = ds.pickUser("Bob");
        delphi.recommend(mu1);
        // Pick an item and find the items most similar to it
        MusicItem mi = ds.pickItem("La Bamba");
        delphi.findSimilarItems(mi);
  42. 42. Item based: Peruse the code: Delphi, UserBasedSimilarity, ItemBasedSimilarity, BaseSimilarityMatrix, RatingCountMatrix
  48. 48. Text Parsing & Analysis: No more ratings, what do we do? Now we deal with documents, so we need to define similarity based on the content of the documents. Use Lucene’s StandardAnalyzer, or build your own! (see CustomAnalyzer)
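
To make "similarity based on the content of the documents" concrete, here is a small, dependency-free sketch. It is a stand-in for what an analyzer such as Lucene's would do, not an example of the Lucene API: the class `TermVectors`, the tiny stop-word list, and the choice of raw term counts compared by cosine similarity are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative bag-of-words document similarity. */
final class TermVectors {
    // A deliberately tiny, hypothetical stop-word list.
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "the", "of", "to", "in", "is"));

    private TermVectors() {}

    /** Lowercases, splits on non-alphanumeric characters, and counts the remaining terms. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            frequencies.merge(token, 1, Integer::sum);
        }
        return frequencies;
    }

    /** Cosine similarity between two sparse term-count vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

A real analyzer would typically add stemming and a fuller stop-word list, but the overall pipeline of tokenizing, weighting, and comparing vectors has the same shape.
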
  52. 52. Document representation: No more ratings!
  56. 56. Netflix Prize Description: Netflix prize. More than 100 million ratings; 480 thousand randomly-chosen, anonymous customers; 18 thousand movie titles
  60. 60. Lessons learned: Important considerations. Data normalization. Neighbor selection: How many neighbors? Who are the “best” neighbors? Neighbor weights. “Our experience is that most efforts should be concentrated in deriving substantially different approaches, rather than refining a single technique.”
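
The three considerations on this slide (normalization, neighbor selection, neighbor weights) can be seen together in a basic neighborhood predictor. The sketch below is one common textbook variant and not the method of any Netflix Prize team: it normalizes by subtracting each user's mean rating, selects the k most similar users who rated the item, and weights them by similarity. The class `NeighborPredictor`, the use of `Double.NaN` for missing ratings, and the precomputed symmetric `similarity` matrix are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative mean-centered k-nearest-neighbor rating prediction. */
final class NeighborPredictor {
    private NeighborPredictor() {}

    /** ratings[u][i] holds user u's rating of item i, or Double.NaN if unrated. */
    static double predict(double[][] ratings, double[][] similarity,
                          int user, int item, int k) {
        double userMean = mean(ratings[user]);

        // Neighbor selection: other users who actually rated the item.
        List<Integer> candidates = new ArrayList<>();
        for (int v = 0; v < ratings.length; v++) {
            if (v != user && !Double.isNaN(ratings[v][item])) {
                candidates.add(v);
            }
        }
        // Keep the most similar users first.
        candidates.sort(
                Comparator.comparingDouble((Integer v) -> similarity[user][v]).reversed());

        double weightedDeviation = 0.0;
        double totalWeight = 0.0;
        for (int n = 0; n < Math.min(k, candidates.size()); n++) {
            int v = candidates.get(n);
            double weight = similarity[user][v]; // neighbor weight
            // Normalization: use the neighbor's deviation from their own mean.
            weightedDeviation += weight * (ratings[v][item] - mean(ratings[v]));
            totalWeight += Math.abs(weight);
        }
        return totalWeight == 0.0 ? userMean : userMean + weightedDeviation / totalWeight;
    }

    /** Mean of the rated (non-NaN) entries; 0.0 if the user rated nothing. */
    static double mean(double[] row) {
        double sum = 0.0;
        int count = 0;
        for (double rating : row) {
            if (!Double.isNaN(rating)) {
                sum += rating;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }
}
```

Changing k, the similarity function, or the normalization step changes the predictions, which is why the slide lists each of them as a separate consideration.
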
  65. 65. Summary: Important considerations. Business value validation - “Long Tail”, “niches to riches”, etc. Similarity metrics - Many to choose from, do not be afraid to explore! Collaborative Filtering: “Show me your friend ...” (User based, Item based). Content based recommendations - NLP challenges. Large scale implementations - Speed, data size, quality