
recommender_systems


  1. RECOMMENDER SYSTEM, RAMIN, SUMMER 2016
  2. INTRODUCTION ● What is the story we're trying to tell, and why should you be as excited about this project as we are?
  3. THE DATASET
  4. Understanding the Data
  5. FEATURE SELECTION "Given all this data, how do we select only the relevant parts and discard the rest?"
  6. DATA CAN BE TRICKY!
  7. DATA CAN BE TRICKY! Imbalanced categories!
  8. WHICH OF THESE FEATURES IS USEFUL?
  9. STARBUCKS - FILTERING DATA. All people: how much do you love Starbucks? People who love Starbucks: how old are you? (Plots: distribution of ratings, then distribution of ages.)
  10. THE RECOMMENDER SYSTEM (A SIMPLE EXAMPLE)
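The slide's two-step filter (rate first, then look at the ages of the fans) can be sketched in a few lines; the survey records below are made up for illustration:

```python
# Hypothetical survey rows: a 1-5 Starbucks rating and an age per person.
survey = [
    {"rating": 5, "age": 24},
    {"rating": 2, "age": 57},
    {"rating": 4, "age": 31},
    {"rating": 1, "age": 45},
]

# Step 1: "people who love Starbucks" - keep only the high ratings.
lovers = [row for row in survey if row["rating"] >= 4]

# Step 2: "how old are you?" - look at the ages of that subgroup.
ages_of_lovers = [row["age"] for row in lovers]
```

The same two-step pattern (filter, then inspect one column) is how the later feature-selection slides slice the dataset.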
  11. Linear Regression: Our Trained Model. Each data point represents how much a user likes item 'X' based on his or her income. (Axes: income vs. rating.)
  12. HOW DO WE MAKE PREDICTIONS?
  13. THE RECOMMENDER SYSTEM: PERSONALIZATION
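A minimal sketch of the trained model above: an ordinary least squares fit of rating against income, with invented data points:

```python
# Closed-form simple linear regression: rating = slope * income + intercept.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

incomes = [30, 50, 70, 90]       # in thousands of dollars, hypothetical
ratings = [2.0, 3.0, 4.0, 5.0]   # how much each user liked item 'X'

slope, intercept = fit_line(incomes, ratings)

# Prediction is just evaluating the line at a new user's income:
predicted = slope * 60 + intercept
```

This is also the answer to the next slide's question: making a prediction means plugging a new user's income into the fitted line.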
  14. DRINK RECOMMENDER. Story: After spending 8 years studying engineering in college, I decided to become a bartender. I kept notes of what my customers liked and disliked. I want to create a recommender system that suggests and creates new drinks my customers will most likely buy.
  15. GETTING TO KNOW MY MAKE-BELIEVE CUSTOMERS: ANALYZING DATA. (User-by-item table with many unknown "?" entries, plus user info.)
  16. Miranda: likes more dessert-y drinks; not into bitter drinks; can't tell if Miranda is rich. Sansa: she likes Martinis and hates everything else? We don't know much about her. Cersei: only into strong coffee drinks; she likes alcoholic drinks and she's rich.
  17. INFORMATION OVERLOAD. I could spend hours, days, months, years, and perhaps decades analyzing the data manually. I need a way to factor in information about my customers, my drinks, and the interactions between them, and automate the recommending process with a computer. I should've taken that machine learning course...
  18. PERSONALIZATION. Connecting users to items: user | movies, user | products, user | music. Is browsing every movie, product, or song even practical? "We need new ways to discover content."
  19. PERSONALIZATION. Connecting users to items: ● user to drinks? ● user to user? ● user to medicine?
  20. PERSONALIZING DRINKS. Bartender's note: Enio ordered a Martini last weekend in the evening, and a coffee on Monday at noon. A good recommender system adapts over time and can take multiple sessions into account.
  21. POPULAR RECOMMENDER (LEVEL 0). The popular recommender counts purchases per item and recommends the most popular item on the menu: Espresso. ● Completely lacks personalization. My customers have different tastes and want drinks that match their interests.
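The Level 0 recommender is essentially a one-liner over the purchase log; the log below is hypothetical:

```python
from collections import Counter

# Level 0: recommend whatever sells most, no matter who is asking.
purchases = ["Espresso", "Mocha", "Espresso", "Martini", "Espresso", "Mocha"]

def most_popular(purchase_log):
    counts = Counter(purchase_log)
    return counts.most_common(1)[0][0]  # item with the highest count
```

Every customer, coffee lover or not, gets the same answer, which is exactly the lack of personalization the slide complains about.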
  22. CLASSIFICATION MODEL (LEVEL 1). Inputs: user info, item info, past history, other context → classification model → "Yes, Miranda will probably like Mocha" or "No, Miranda will probably dislike Mocha." Pros: personalized; captures context. Cons: we don't have access to all of this information, and if we input wrong information, we will make wrong predictions.
  23. COLLABORATIVE FILTERING (LEVEL 2). People who liked X also liked Y; if Ramin likes X, he might also like Y. Co-occurrence matrix (a symmetric item-item matrix), where each entry counts the number of people who purchased both items (e.g., both Mocha and Espresso):

      Item      Espresso  Martini  Mocha
      Espresso     a         3       2
      Martini      3         b       7
      Mocha        2         7       c

      Jose bought a Mocha the other day; what should I recommend him now? A Martini.
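Building those co-occurrence counts from per-user purchase baskets can be sketched like this (the baskets are made up):

```python
from itertools import combinations
from collections import Counter

# Each basket is the set of drinks one customer has purchased.
baskets = [
    {"Espresso", "Martini"},
    {"Espresso", "Martini", "Mocha"},
    {"Martini", "Mocha"},
]

# Count every unordered item pair that appears together in a basket.
cooc = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        cooc[(a, b)] += 1  # matrix is symmetric, so one ordered key per pair

def count(a, b):
    """Look up the co-occurrence count regardless of argument order."""
    return cooc[tuple(sorted((a, b)))]
```

To recommend for Jose, you would look along the row of his last purchase and pick the column with the highest count.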
  24. COLLABORATIVE FILTERING (LEVEL 2). The popular-item effect: once a very popular item like Coffee enters the matrix, its counts dominate every row:

      Item      Espresso  Martini  Mocha  Coffee
      Espresso     a         3       2      45
      Martini      3         b       7      23
      Mocha        2         7       c      39
      Coffee      45        23      39       d

      No matter what Jose has purchased, the recommender system will recommend Coffee. There are ways to normalize the data to avoid the popular-item effect.
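One common normalization, assuming we pick Jaccard similarity (the slide does not name a specific scheme), divides co-purchases by the number of users who bought either item, so a blockbuster seller no longer dominates:

```python
# Jaccard similarity between two items, given the sets of users who bought each.
def jaccard(bought_a, bought_b):
    both = len(bought_a & bought_b)    # users who bought both items
    either = len(bought_a | bought_b)  # users who bought at least one
    return both / either if either else 0.0

coffee_buyers = {"u1", "u2", "u3", "u4"}  # coffee is popular (hypothetical users)
mocha_buyers = {"u1", "u2"}
```

Here `jaccard(coffee_buyers, mocha_buyers)` is 2/4 = 0.5: the raw count of 2 co-purchases is discounted by coffee's large customer base.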
  25. COLLABORATIVE FILTERING (LEVEL 2). No history: I'm only looking at Jose's most recent purchase of a Mocha. What if he bought a Martini before and didn't like it?
  26. COLLABORATIVE FILTERING (LEVEL 2). Weighted average of purchased items: each matrix entry is divided by a normalizing factor N, and a candidate item is scored by averaging its similarity to the items in the user's history. Purchase history for Jose: Mocha - YES, Martini - NO. To decide whether to recommend Coffee: Score_coffee = 1/3 (S_coffee,mocha + S_coffee,martini + S_coffee,espresso). Sort the scores and pick the item with the highest score.
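The scoring rule can be sketched directly; the similarity values below stand in for hypothetical normalized matrix entries:

```python
# Normalized item-item similarities, keyed by sorted pair (made-up values).
similarity = {
    ("Coffee", "Espresso"): 0.8,
    ("Coffee", "Martini"): 0.2,
    ("Coffee", "Mocha"): 0.6,
}

def score(candidate, history):
    """Average similarity between the candidate and every item in the history."""
    sims = [similarity[tuple(sorted((candidate, item)))] for item in history]
    return sum(sims) / len(sims)

jose_history = ["Mocha", "Martini", "Espresso"]
coffee_score = score("Coffee", jose_history)  # (0.6 + 0.2 + 0.8) / 3
```

Computing this score for every candidate drink and sorting gives the ranked recommendation list the slide describes.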
  27. COLLABORATIVE FILTERING (LEVEL 2). More problems! ● Can't use context like the time of day. ● Can't use my customers' ages to my advantage. ● Can't use information about my drinks and their ingredients to make better recommendations. And what if I have a new customer: what should I recommend? What if I'm making a new drink: who would buy it? This is the cold-start problem.
  28. NEW ITEMS. New drink: Coffee Martini, with feature weights 4 (alcoholic drink) and 2 (coffee drink). Miranda averages 1.8 on alcoholic drinks and 3.5 on coffee drinks; Enio averages 4.8 and 1.5. Predicted affinities: Miranda & Coffee Martini = 4*1.8 + 2*3.5 = 14.2; Enio & Coffee Martini = 4*4.8 + 2*1.5 = 22.2.
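The cold-start arithmetic above is a dot product between the new drink's feature weights and each user's per-feature average ratings:

```python
# New drink described by feature weights (from the slide).
coffee_martini = {"alcoholic": 4, "coffee": 2}

# Each user's average rating per feature (from the slide).
user_profiles = {
    "Miranda": {"alcoholic": 1.8, "coffee": 3.5},
    "Enio": {"alcoholic": 4.8, "coffee": 1.5},
}

def affinity(item, profile):
    """Dot product of item feature weights and user per-feature averages."""
    return sum(weight * profile[feature] for feature, weight in item.items())

affinity(coffee_martini, user_profiles["Miranda"])  # 4*1.8 + 2*3.5 = 14.2
affinity(coffee_martini, user_profiles["Enio"])     # 4*4.8 + 2*1.5 = 22.2
```

Because the drink is described only by its features, this works even though nobody has ever bought a Coffee Martini.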
  29. MATRIX FACTORIZATION (LEVEL 3). ● We need a recommender system that factors in more than just users' past purchase history. ● A system that can also use personalized information about the user, the product, and the time of the visit, plus all the goodies we get from collaborative filtering. ● It must learn even when data is unavailable (very sparse matrices, missing entries); keep in mind that each user only tries a few drinks. The rating matrix has users as rows and items as columns: a filled cell is a rating available from user U for item V; a blank cell means we don't know what the user thinks about item V.
  30. MATRIX FACTORIZATION (LEVEL 3). We need to fill in the blank cells using all of the available information. Rating ≈ dictionary (bases) × activation matrix (encodings): a whole lot of fancy math factorizes the rating matrix into the product of two smaller matrices that uncover those hidden areas by minimizing some cost function.
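One common way that "fancy math" is carried out (a toy sketch, not necessarily the exact method behind these slides) is stochastic gradient descent over only the observed entries:

```python
import random

# Observed (user, item) -> rating pairs; everything else is a blank cell.
ratings = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 4.0, (2, 1): 2.0}
n_users, n_items, k = 3, 2, 2  # k latent factors

random.seed(0)
U = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
V = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]

def predict(u, i):
    """Predicted rating is the dot product of user and item latent vectors."""
    return sum(U[u][f] * V[i][f] for f in range(k))

lr, reg = 0.05, 0.01  # learning rate and L2 regularization strength
for _ in range(2000):
    for (u, i), r in ratings.items():
        err = r - predict(u, i)  # error on this observed cell only
        for f in range(k):
            U[u][f] += lr * (err * V[i][f] - reg * U[u][f])
            V[i][f] += lr * (err * U[u][f] - reg * V[i][f])
```

After training, `predict` also returns values for the cells that were never observed, e.g. `predict(2, 0)`: that is the "filling in the white boxes" the slide describes.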
  31. BLENDING MODELS (LEVEL 4). Point: there is no universal recommender system that works for everything. We need to blend different models to attack different applications. Netflix Challenge: from 100 million movie ratings, predict 3 million of them with the highest accuracy. The winning team blended over 100 models to gain a 10.35% improvement in accuracy (and got 1 million dollars for it!).
  32. PERFORMANCE METRICS FOR RECOMMENDER SYSTEMS. RMS ● fraction of items correctly recommended. BUT: ● we care more about what the user liked than what they didn't; ● imbalanced information can skew the results; ● with this metric you can get good accuracy by recommending nothing at all! Recall = (# liked & # shown) / # liked ● of the items the user liked, how many were actually recommended? The world we're looking at contains only the liked items. Precision = (# liked & # shown) / # shown ● of the recommended items, how many did the user actually like? The world is all the recommended items: how much garbage must I look through until I find what I like (attention span)?
  33. OPTIMAL RECOMMENDER. Recall = (# liked & # shown) / # liked. Maximizing recall alone is easy: recommend everything! Then recall = 1, but precision is very small. Precision = (# liked & # shown) / # shown. The best recommender would recommend only the products the user likes: precision = 1 and recall = 1. Point: use both precision and recall, along with other metrics such as RMS, to evaluate your recommender system.
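Both metrics, exactly as defined on these slides, with a made-up example that also shows the recommend-everything failure mode:

```python
# "shown" is the recommended set, "liked" is what the user actually enjoys.
def precision(shown, liked):
    return len(shown & liked) / len(shown) if shown else 0.0

def recall(shown, liked):
    return len(shown & liked) / len(liked) if liked else 0.0

liked = {"Mocha", "Espresso", "Latte"}
shown = {"Mocha", "Martini"}

precision(shown, liked)  # 1/2: one of the two recommendations was liked
recall(shown, liked)     # 1/3: one of the three liked items was surfaced

# "Recommend everything" drives recall to 1 while precision collapses:
everything = liked | {"Martini", "Wine", "Beer", "Cider"}
recall(everything, liked)  # 1.0, but precision is only 3/7
```

This is why the slides insist on tracking both numbers rather than optimizing either one alone.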
  34. PRECISION-RECALL CURVE: VARY # ITEMS. Optimal recommender: ● precision stays at 1, because it only recommends what I like; ● recall increases, because it uncovers more of the items the user likes as we raise the threshold on the number of items, from one liked item recommended up to all liked items recommended. (Axes: recall vs. precision; cuts at 1, 2, 3, 4, ... / total number of interests.)
  35. PRECISION-RECALL CURVE. A more realistic recommender: as we recommend more items to the user, the area below the precision-recall curve drops; we start introducing garbage! (Smoothed-out curve for a realistic recommender.)
  36. PRECISION-RECALL CURVE. Based on the application, you could also look at some weighted average of these metrics. For now, we can use the area below the curve to choose between recommender systems: the orange curve is a better recommender system than the red one.
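The curve itself can be traced by sweeping the cutoff k over a ranked recommendation list (ranking and likes invented), and the area under it approximated with the trapezoid rule:

```python
# A ranked recommendation list and the user's true likes, both hypothetical.
ranked = ["Mocha", "Martini", "Espresso", "Wine", "Latte"]
liked = {"Mocha", "Espresso", "Latte"}

# Record (recall, precision) at each cutoff k = 1 .. len(ranked).
curve = []
for k in range(1, len(ranked) + 1):
    shown = set(ranked[:k])
    hits = len(shown & liked)
    curve.append((hits / len(liked), hits / k))

# Rough area under the curve via the trapezoid rule over recall:
area = sum((r2 - r1) * (p1 + p2) / 2
           for (r1, p1), (r2, p2) in zip(curve, curve[1:]))
```

Computing `area` for two recommenders over the same test set gives a single number for the "which curve is better" comparison the slide makes.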
  37. DRINK RECOMMENDER - AN EVALUATION. Comparing a matrix factorization recommender against a similarity-based recommender (with item info, with/without user info).
  38. BUILDING BLOCKS OF A RECOMMENDER SYSTEM. Raw data (user info, user-item ratings, item info) → preprocessing (clean up your data! feature selection, re-formatting) → processed data → split into training, validation, and test sets → train the recommender system, tune it on the validation set, and evaluate it on the test set.
  39. BACK TO OUR AARP SERVICE RECOMMENDER SYSTEM. Data and training: ● 100 participants ● 46 questions ● imbalanced demographic classes ● less than 1% missing data ● training with 80% of the data ● tuning with cross-validation ● testing with 20% of the data. (Plots: user info - salary; user info - age; item info - outdoors.)
  40. USER INFO AND ITEM INFO. (Curves for the popular, similarity, and matrix factorization recommenders.)
  41. ITEM INFO
  42. USER INFO. AHA MOMENT! What is going on???
  43. NO USER DATA, NO ITEM DATA. RMS values: 2.61, 1.58, 1.63. Note the precision range!
