People who liked this talk also liked … Building Recommendation Systems Using Ruby

From Amazon, to Spotify, to thermostats, recommendation systems are everywhere. The ability to provide recommendations for your users is becoming a crucial feature for modern applications. In this talk I'll show you how you can use Ruby to build recommendation systems for your users. You don't need a PhD to build a simple recommendation engine -- all you need is Ruby. Together we'll dive into the dark arts of machine learning and you'll discover that writing a basic recommendation engine is not as hard as you might have imagined. Using Ruby I'll teach you some of the common algorithms used in recommender systems, such as: Collaborative Filtering, K-Nearest Neighbor, and Pearson Correlation Coefficient. At the end of the talk you should be on your way to writing your own basic recommendation system in Ruby.

Published in: Technology

  1. People who liked this talk also liked … Building Recommendation Systems Using Ruby. Ryan Weald, @rweald. LA RubyConf 2013
  2. Who is this guy? What does he know about recommendation systems?
  3. Data Scientist @Sharethrough. Native advertising platform
  4. [no text on slide]
  5. Outline
     1) What is a recommendation system?
     2) Collaborative filtering based recommendations
     3) Content based recommendations
     4) Hybrid systems - the best of both worlds
     5) Evaluating your recommendation system
     6) Resources & existing libraries
  6. What this Talk is Not
     • Everything there is to know about recommendation systems
     • Bleeding edge machine learning
     • How to use a specific library
  7. What is a recommendation system?
  8. A program that predicts a user’s preferences using information about the user, other users, and the items in your system.
  9. LinkedIn
  10. Netflix
  11. Spotify
  12. Amazon
  13. How do I build recommendations?
  14. Two Main Categories of Algorithm
      1. Collaborative Filtering (CF)
      2. Content Based - Classification
  15. Collaborative Filtering: Fill in missing user preferences using similar users or items
  16. Two Types of CF
      1. Memory Based - Uses similarity between users or items. Dataset usually kept in memory
      2. Model Based - Model generated to “explain” observed ratings
  17. User Based CF: (User x Item) Matrix + Similarity Function = Top-K most similar users
  18. Collaborative Filtering

               Video 1   Video 2   Video 3   Video 4   Video 5
      User 1      0         1         0         5         0
      User 2      1         2         1         0         5
      User 3      2         5         0         0         2
      User 4      5         4         4         1         1
      User 5      2         4         2         ?         ?

      * 0 denotes not rated
  19. Similarity Functions
      • Pearson Correlation Coefficient
      • Cosine Similarity
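
Cosine similarity, the second function listed on the slide, measures the angle between two rating rows. The following is a minimal sketch, not code from the talk: the method name `cosine_similarity` is hypothetical, and it assumes unrated items are stored as 0 so they contribute nothing to the dot product.

```ruby
# Cosine similarity between two rating rows of equal length.
# Assumption: unrated items are stored as 0, so they add nothing
# to the dot product.
def cosine_similarity(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |y| y * y })
  # A user with no ratings has no direction to compare against.
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Rows for User 2 and User 3 from the ratings table on slide 18.
user_2 = [1, 2, 1, 0, 5]
user_3 = [2, 5, 0, 0, 2]
puts cosine_similarity(user_2, user_3)
```

A value near 1 means the two users rate in nearly the same direction; a value of 0 means they share no rated items.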
  20. Pearson Correlation Coefficient
  21-26. Calculating PCC [six slides, no further text]
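
The slide text for the PCC calculation did not survive, so here is a rough sketch of the standard Pearson formula in Ruby: the covariance of two users' ratings divided by the product of their standard deviations. The method name `pearson` and the decision to compare only items both users have rated (treating 0 as "not rated", as in the table on slide 18) are assumptions, not necessarily what the talk showed.

```ruby
# Pearson correlation between two users' rating rows.
# Assumption: 0 means "not rated", so only items that both users
# have rated are compared.
def pearson(a, b)
  pairs = a.zip(b).select { |x, y| x > 0 && y > 0 }
  n = pairs.size
  return 0.0 if n < 2 # too little overlap to correlate

  xs, ys = pairs.transpose
  mean_x = xs.sum.to_f / n
  mean_y = ys.sum.to_f / n

  cov   = pairs.sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }

  denom = Math.sqrt(var_x * var_y)
  denom.zero? ? 0.0 : cov / denom
end

# Rows from the ratings table; User 5's two "?" cells are stored as 0.
user_2 = [1, 2, 1, 0, 5]
user_4 = [5, 4, 4, 1, 1]
user_5 = [2, 4, 2, 0, 0]

puts pearson(user_5, user_2) # strongly positive
puts pearson(user_5, user_4) # negative
```

On the co-rated videos, User 2's ratings are exactly half of User 5's, so the correlation is about 1, while User 4 comes out at about -0.5.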
  27. [no text on slide]
  28. Using similarity to recommend items
  29. Collaborative Filtering

               Video 1   Video 2   Video 3   Video 4   Video 5
      User 1      0         1         0         5         0
      User 2      1         2         1         0         5
      User 3      2         5         0         0         2
      User 4      5         4         4         1         1
      User 5      2         4         2         ?         ?

      * 0 denotes not rated
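
One common way to fill in User 5's "?" cells is a similarity-weighted average over the most similar users who rated the item. The sketch below assumes that approach; the talk may have used a different weighting. The names `RATINGS`, `pearson`, and `predict`, the default `k: 2`, and the choice to ignore neighbours with zero or negative similarity are all assumptions made for illustration.

```ruby
# Ratings table from the slide; 0 means "not rated".
RATINGS = {
  "User 1" => [0, 1, 0, 5, 0],
  "User 2" => [1, 2, 1, 0, 5],
  "User 3" => [2, 5, 0, 0, 2],
  "User 4" => [5, 4, 4, 1, 1],
  "User 5" => [2, 4, 2, 0, 0] # the two "?" cells, stored as 0
}

# Pearson correlation over the items both users have rated.
def pearson(a, b)
  pairs = a.zip(b).select { |x, y| x > 0 && y > 0 }
  n = pairs.size
  return 0.0 if n < 2

  xs, ys = pairs.transpose
  mean_x = xs.sum.to_f / n
  mean_y = ys.sum.to_f / n
  cov   = pairs.sum { |x, y| (x - mean_x) * (y - mean_y) }
  denom = Math.sqrt(xs.sum { |x| (x - mean_x)**2 } * ys.sum { |y| (y - mean_y)**2 })
  denom.zero? ? 0.0 : cov / denom
end

# Predict a rating as the similarity-weighted average of the k most
# similar users who rated the item. Returns nil when no positively
# correlated user has rated it.
def predict(ratings, user, item, k: 2)
  neighbours = ratings
    .reject { |name, row| name == user || row[item].zero? }
    .map    { |_, row| [pearson(ratings[user], row), row[item]] }
    .select { |sim, _| sim > 0 }
    .max_by(k) { |sim, _| sim }
  return nil if neighbours.empty?

  neighbours.sum { |sim, rating| sim * rating } / neighbours.sum { |sim, _| sim }
end

p predict(RATINGS, "User 5", 3) # Video 4 (index 3)
p predict(RATINGS, "User 5", 4) # Video 5 (index 4)
```

With this data, no positively correlated user has rated Video 4, so the sketch returns `nil` for it, while Video 5 gets a prediction of about 3.5 from Users 2 and 3. The `nil` case is a small instance of the data sparsity problem the talk raises next.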
  30. [no text on slide]
  31. Problems With CF
      • Cold Start
      • Data Sparsity
      • Resource expensive
  32. Doesn’t the video content matter for recommendations?
  33. Content Based Recommendations: Classify items based on features of the item. Pick other items from same class to recommend.
  34. Content Based Algorithms
      • K-means clustering
      • Random Forest
      • Support Vector Machines
      • ...
      • Insert your favorite ML algorithm
  35. Content Based Algorithms

               Type of content   Duration   Maturity Rating
      Video 1  comedy            60         G
      Video 2  action            120        G
      Video 3  comedy            34         PG-13
      Video 4  romantic          15         R
      Video 5  sports            120        G
  36. K-means Clustering: Group items into K clusters. Assign new item to a cluster and pick items from that cluster
  37. K-means Clustering
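
The two steps on slide 36 (group items into K clusters, then assign a new item to a cluster) can be sketched in plain Ruby. Everything about the feature encoding here is an assumption: durations are taken to be minutes and scaled to hours, maturity ratings are mapped to 0, 1, 2, and the content type column is left out to keep the vectors numeric. The names `kmeans`, `nearest_cluster`, and `MATURITY` are hypothetical, and seeding the centroids with the first k points is a naive choice made only so the sketch is deterministic.

```ruby
# Euclidean distance between two numeric feature vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Index of the centroid closest to a point.
def nearest_cluster(point, centroids)
  centroids.each_index.min_by { |i| distance(point, centroids[i]) }
end

# Plain k-means. Assumption: the first k points seed the centroids.
# Returns the cluster label of each point and the final centroids.
def kmeans(points, k, iterations: 10)
  centroids = points.first(k)
  labels = []
  iterations.times do
    # Assignment step: attach every point to its nearest centroid.
    labels = points.map { |point| nearest_cluster(point, centroids) }
    # Update step: move each centroid to the mean of its members.
    centroids = centroids.each_with_index.map do |old, i|
      members = points.select.with_index { |_, j| labels[j] == i }
      next old if members.empty?

      members.transpose.map { |dim| dim.sum.to_f / dim.size }
    end
  end
  [labels, centroids]
end

# Assumed numeric encoding of the maturity ratings in the table.
MATURITY = { "G" => 0, "PG-13" => 1, "R" => 2 }

videos = {
  "Video 1" => [60, "G"],
  "Video 2" => [120, "G"],
  "Video 3" => [34, "PG-13"],
  "Video 4" => [15, "R"],
  "Video 5" => [120, "G"]
}
points = videos.values.map { |minutes, rating| [minutes / 60.0, MATURITY[rating]] }

labels, centroids = kmeans(points, 2)
videos.keys.zip(labels).each { |name, label| puts "#{name}: cluster #{label}" }

# Assign a new, hypothetical 90-minute G-rated video to a cluster.
new_video = [90 / 60.0, MATURITY["G"]]
puts "New video: cluster #{nearest_cluster(new_video, centroids)}"
```

Under this encoding the three G-rated videos end up in one cluster and the PG-13 and R videos in the other. A different scaling of duration or a different seed would change the result, which is part of why the talk calls unsupervised learning hard.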
  38. Problems With Content Based Recommendations
      • Unsupervised Learning is hard
      • Training data limited or expensive
      • Doesn’t take user into account
      • Limited by features of content
  39. Hybrid Recommendations: Combine collaborative filtering with content based algorithm to achieve greater results
  40. Hybrid Recommendations [diagram: Input → CF Based Recommender and Input → Content Based Recommender, both feeding a Combiner → Reco]
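
The combiner in the diagram on slide 40 is not specified in the transcript. One simple possibility, sketched here as an assumption, is a weighted sum of the scores each recommender produces. The method name `combine`, the `cf_weight` parameter, and the example scores are all hypothetical.

```ruby
# Merge the scores of two recommenders into one ranked list.
# Assumption: both recommenders return a Hash of item => score on a
# comparable scale, and an item missing from one of them scores 0.0.
def combine(cf_scores, content_scores, cf_weight: 0.5)
  items = cf_scores.keys | content_scores.keys
  ranked = items.map do |item|
    score = cf_weight * cf_scores.fetch(item, 0.0) +
            (1 - cf_weight) * content_scores.fetch(item, 0.0)
    [item, score]
  end
  ranked.sort_by { |_, score| -score }
end

# Made-up scores, purely for illustration.
cf_scores      = { "Video 4" => 0.2, "Video 5" => 0.7 }
content_scores = { "Video 4" => 0.9, "Video 3" => 0.4 }

combine(cf_scores, content_scores).each do |item, score|
  puts "#{item}: #{score.round(2)}"
end
```

Raising `cf_weight` favours the collaborative filtering scores, which can make sense once enough ratings exist; lowering it leans on content features, which helps with the cold start problem.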
  41. Hybrid Recommendations
  42. Hybrid Recommendations [diagram labels: Input, Content Recommender, CF Recommender, Reco]
  43. Hybrid Recommendations [diagram labels: Input, CF Recommender, Content Recommender, Reco]
  44. Evaluating Recommendation Quality
      • Precision vs. Recall
      • Clicks
      • Click through rate
      • Direct user feedback
  45. Precision vs. Recall
  46. Precision vs. Recall
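
Precision is the fraction of recommended items the user actually found relevant, and recall is the fraction of relevant items that were recommended. A minimal sketch of both, assuming items are identified by simple values such as IDs (the method name `precision_recall` and the example IDs are hypothetical):

```ruby
# Precision: share of recommended items that were relevant.
# Recall: share of relevant items that were recommended.
def precision_recall(recommended, relevant)
  hits = (recommended & relevant).size
  precision = recommended.empty? ? 0.0 : hits.to_f / recommended.size
  recall    = relevant.empty? ? 0.0 : hits.to_f / relevant.size
  [precision, recall]
end

recommended = [1, 2, 3, 4] # video IDs shown to the user
relevant    = [2, 4, 5]    # video IDs the user actually liked

precision, recall = precision_recall(recommended, relevant)
puts "precision: #{precision.round(2)}, recall: #{recall.round(2)}"
```

Recommending more items tends to raise recall and lower precision, which is why the slide frames the two as a trade-off.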
  47. Summary of What We’ve Learned
      • Collaborative Filtering using similar users
      • Content clustering using k-means
      • Combining 2 algorithms to boost quality
      • How to evaluate your recommender
  48. Don’t Reinvent the Wheel
      • Apache Mahout
      • JRuby mahout gem
      • SciRuby
      • Recommenderlab for R
  49. Resources & Further Reading
      • Recommender Systems: An Introduction
      • Linden, Greg, Brent Smith, and Jeremy York. "Amazon.com recommendations: Item-to-item collaborative filtering."
      • Resnick, Paul, et al. "GroupLens: an open architecture for collaborative filtering of netnews."
      • ACM RecSys Conference Proceedings
  50. We’re Hiring: http://bit.ly/str-engineering
  51. Thanks! Twitter: @rweald. Email: ryan@sharethrough.com
