Tutorial on Sequence-Aware Recommender Systems - UMAP 2018


  1. 1. Sequence-Aware Recommender Systems Massimo Quadrana, Pandora Media, USA mquadrana@pandora.com Paolo Cremonesi, Politecnico di Milano paolo.cremonesi@polimi.it Dietmar Jannach, AAU Klagenfurt, Austria dietmar.jannach@aau.at
  2. 2. About today’s presenter(s) • Paolo Cremonesi – Politecnico di Milano, Italy • Dietmar Jannach – University Klagenfurt, Austria • Massimo Quadrana – Pandora, Italy 2
  3. 3. • M. Quadrana, P. Cremonesi, D. Jannach, "Sequence-Aware Recommender Systems", ACM Computing Surveys, 2018 • Link to paper: bit.ly/sequence-aware-rs Today's tutorial based on 3
  4. 4. About you? 4
  5. 5. Agenda • 13:30 – 15:00 Part 1 – Introduction, Problem definition • 15:00 – 15:30 Coffee break • 15:30 - 16:15 Part 2 – Algorithms • 16:15 – 16:45 Part 3 – Evaluation • 16:45 – 17:00 – Closing, Discussion 5
  6. 6. Part I: Introduction
  7. 7. Recommender Systems • A central part of our daily user experience – They help us locate potentially interesting things – They serve as filters in times of information overload – They have an impact on user behavior and business 7
  8. 8. Recommendations everywhere 8
  9. 9. A field with a tradition The last 18 years in RecSys research: • 2000 recommendation = matrix completion • 2010 recommendation = learning to rank 9
  10. 10. Common problem abstraction (1): matrix completion • Goal – Learn missing ratings • Quality assessment – Error between true and estimated ratings 10

              Item1  Item2  Item3  Item4  Item5
      Alice     5      ?      4      4      ?
      Paolo     3      1      ?      3      3
      Susan     ?      3      ?      ?      5
      Bob       3      ?      1      5      4
      George    ?      5      5      ?      1
  11. 11. Common problem abstraction (2): learning to rank • Goal – Learn relative ranking of items • Quality assessment – Error between true and estimated ranking 11
  12. 12. A field with a tradition • But are we focusing on the most important problems? 12
  13. 13. Real-world problem situations • User intent • Short-term intent/context vs. long term taste • Order can matter • Order matters (constraints) • Interest drift • Reminders • Repeated purchases 13
  14. 14. User intent • Our user searched and listened to “Last Christmas” by Wham! • Should we, … – Play more songs by Wham!? – More pop Christmas songs? – More popular songs from the 1980s? – Play more songs with controversial user feedback? 14
  15. 15. User intent • Knowing the user’s intention can be crucial 15
  16. 16. Short-term intent/context vs. long-term taste • Here's what the customer purchased during the last weeks • Now, the user returns to the shop and browses these items 16
  17. 17. Short-term intent/context vs. long-term taste • What to recommend? • Some plausible options – Only shoes – Mostly Nike shoes – Maybe also some T-shirts (figure: products purchased in the past vs. products browsed in the current session) 17
  18. 18. Short-term intent/context vs. long-term taste • Using the matrix completion formulation – One trains a model based only on past actions – Without the context, the algorithm will most probably recommend mostly (Nike) T-shirts and some trousers – Is this what you expect? (figure: products purchased in the past vs. products browsed in the current session) 18
  19. 19. Order can matter • Next track recommendation • What to recommend next should suit the previous tracks 19
  20. 20. Order matters (order constraints) • Is it meaningful to recommend Star Wars III, if the user has not seen the previous episodes? 20
  21. 21. Before having a family After having a family Interest drift 21
  22. 22. Reminders • Should we recommend items that the user already knows? • Amazon does 22
  23. 23. Repeated purchases • When should we remind users (through recommendations)? 23
  24. 24. • The choice of the best recommender algorithm depends on – Goal (user task, application domain, context, …) – Data available – Target quality Algorithms depend on … 24
  25. 25. • Movie recommendation – if you watched a SF movie … Examples based on goal … 25
  26. 26. • Movie recommendation – if you watched a SF movie … – … you recommend another SF movie • Algorithm: most similar item Examples based on goal … 26
  27. 27. • On-line store recommendations – if you bought a coffee maker … Examples based on goal … 27
  28. 28. • On-line store recommendations – if you bought a coffee maker … – … you recommend cups or coffee beans (not another coffee maker) • Algorithm: frequently bought together user who bought … also bought Examples based on goal … 28
  29. 29. • If you have – all items with many ratings – no item attributes • You need CF • If you have – no ratings – all items with many attributes • You need CBF Examples based on data … 29
  30. 30. • Goal – next track recommendation, … • Data – new users, … • Need for quality – context, intent, … Examples for sequence-aware algorithms … 30
  31. 31. Implications for research • Many of the scenarios cannot be addressed by a matrix completion or learning-to-rank problem formulation – No short-term context – Only one single user-item interaction pair – Often only one type of interaction (e.g., ratings) – Rating timestamps sometimes exist, but might be disconnected from actually experiencing/using the item 31
  32. 32. • A family of recommenders that – uses different input data, – often bases the recommendations on certain types of sequential patterns in the data, – addresses the mentioned practical problem settings Sequence-Aware Recommenders 32
  33. 33. Problem Characterization 33
  34. 34. Characterizing Sequence-Aware Recommender Systems 34
  35. 35. Inputs 35 • Ordered (and time-stamped) set of user actions – Users known or anonymous (beyond the session)
  36. 36. Inputs 36 • Actions are usually connected with items – Exceptions: search terms, category navigation, …
  41. 41. Inputs 41 • Different types of actions – Item purchase/consumption, item view, add-to- catalog, add-to-wish-list, …
  42. 42. Inputs 42 • Attributes of actions: user/item details, context – Dwelling times, item discounts, etc.
  43. 43. Inputs 43 • Typical: Enriched clickstream data
  44. 44. Inputs: differences with traditional RSs 44 • for each user, all sequences contain one single action (item)
  45. 45. Inputs: differences with traditional RSs (with time-stamp) 45 • for each user, there is one single sequence with several actions (items)
  46. 46. Output (1) 47 • One (or more) ordered list of items • The list can have different interpretations, based on goal, domain, application scenario • Usual item-ranking tasks – list of alternatives for a given item – complements or accessories
  47. 47. Output (2) 48 • An ordered list of items • The list can have different interpretations, based on goal, domain, application scenario • Suggested sequence of actions – next-track music recommendations
  48. 48. Output (3) 49 • An ordered list of items • The list can have different interpretations, based on goal, domain, application scenario • Strict sequence of actions – course learning recommendations
  49. 49. Typical computational tasks • Find sequence-related patterns in the data, e.g., – co-occurrence patterns – sequential patterns – distance patterns • Reasoning about order constraints – weak and strong constraints • Relate patterns with user profile and current point in time – e.g., items that match the current session 51
  50. 50. Abstract problem characterization: Item-ranking or list-generation • Some definitions – U : users – I : items – L : ordered list of items of length k – L* : set of all possible lists L of length up to k – f(u, L) : utility function, with u ∈ U and L ∈ L* • Recommendation task: L_u = argmax_{L ∈ L*} f(u, L), for each u ∈ U – Learn f(u, L) from sequences A of past user actions 53
  51. 51. Abstract problem characterization • Utility function is not limited to scoring individual items • The utility of entire lists can be assessed – including, e.g., transition between objects, fulfillment of order constraints, diversity aspects • The design of the utility function depends on the purpose of the system – e.g., provide logical continuation, show alternatives, show accessories, … Jannach, D. and Adomavicius, G.: "Recommendations with a Purpose". In: Proceedings of the 10th ACM Conference on Recommender Systems (RecSys 2016). Boston, Massachusetts, USA, 2016, pp. 7-10 54
  52. 52. On recommendation purposes • Often, researchers are not explicit about the purpose – Traditionally, could be information filtering or discovery, with conflicting goals • Commonly used abstraction – e.g., predict hidden rating • For sequence-aware recommenders – often: predict next (hidden) action, e.g., for a given session beginning 55
  53. 53. Relation to other areas • Implicit feedback recommender systems – Sequence-aware recommenders are often built on implicit feedback signals (action logs) – Problem formulation is however not based on matrix completion • Context-aware recommender systems – Sequence-aware recommenders often are special forms of context-aware systems – Here: Interactional context is relevant, which is only implicitly defined through the user’s actions 56
  54. 54. • Time-Aware RSs – Sequence-aware recommenders do not necessarily need explicit timestamps – Time-Aware RSs use explicit time • e.g., to detect long-term user drifts • Other: – interest drift – user-modeling Relation to other areas 57
  55. 55. Categorization 58
  56. 56. Categorization of tasks • Four main categories – Context adaptation – Trend detection – Repeated recommendation – Consideration of order constraints and sequential patterns • Notes – Categories are not mutually exclusive – All types of problems are based on the same problem characterization, but with different utility functions, and using the data in different ways 59
  57. 57. Context adaptation • Traditional context-aware recommenders are often based on the representational context – defined set of variables and observations, e.g., weather, time of the day etc. • Here, the interactional context is relevant – no explicit representation of the variables – contextual situation has to be inferred from user actions 60
  58. 58. How much past information is considered? • Last-N interactions based recommendation: – Often used in next-point-of-interest recommendation scenarios – In many cases only the very last visited location is considered – Also: "Customers who bought … also bought" • Reasons to limit oneself: – No more information available – Previous information not relevant 61
  59. 59. How much past information is considered? • Session-based recommendation: – Short-term only – Only the last sequence of actions of the current user is known – User might be anonymous • Session-aware recommendation: – Short-term + long-term – In addition, past sessions of the current user are known – Allows for personalized session-based recommendation Quadrana, M., Karatzoglou, A., Hidasi, B., Cremonesi, P.: "Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks". Proceedings ACM RecSys 2017: 130-137 62
  60. 60. Session-based recommendation (figure: sessions of Anonym 1, Anonym 2, Anonym 3 over time) 63
  61. 61. Session-based recommendation (figure: sessions of Anonym 1, Anonym 2, Anonym 3 over time, labeled Soccer, Cartoons, NBA) 64
  62. 62. Session-aware recommendation (figure: sessions of User 1 and User 2 over time, labeled Soccer, Cartoons, NBA) 65
  63. 63. Session-aware recommendation (figure: User 1's sessions over time, labeled Soccer and NBA – Sports!) 66
  64. 64. Session-aware recommendation (figure: User 1's sessions over time, labeled Soccer/Sports! and Cupcakes) 67
  65. 65. What to find? • Next • Alternatives • Complements • Continuations 68
  66. 66. What to pick • One • All 69
  67. 67. Trend detection • Less explored than context adaptation • Community trends: – Consider the recent or seasonal popularity of items, e.g., in the fashion domain and, in particular, in the news domain • Individual trends: – E.g., natural interest drift • Over time, because of influence of other people, because of a recent purchase, because something new was discovered (e.g., a new artist) Jannach, D., Ludewig, M. and Lerche, L.: "Session-based Item Recommendation in E-Commerce: On Short- Term Intents, Reminders, Trends, and Discounts". User-Modeling and User-Adapted Interaction, Vol. 27(3- 5). Springer, 2017, pp. 351-392 70
  68. 68. Repeated Recommendation • Identifying repeated user behavior patterns: – Recommendation of repeat purchases or actions • E.g., ink for a printer, next app to open after a call on mobile – Patterns can be mined from the individual or the community as a whole • Repeated recommendations as reminders – Remind users of things they found interesting in the past • To remind them of things they might have forgotten • As navigational shortcuts, e.g., in a decision situation • Timing as an interesting question in both situations 71
  69. 69. Consideration of Order Constraints and Observed Sequential Patterns • Two types of sequentiality information – External domain knowledge: strict or weak ordering constraints • Strict, e.g., sequence of learning courses • Weak, e.g., when recommending sequels to movies – Information that is mined from the user behavior • Learn that one movie is always consumed after another • Predict next web page, e.g., using sequential pattern mining techniques 72
  70. 70. Review of existing works (1) • 100+ papers reviewed – Focus on last N interactions common • But more emphasis on session-based / session-aware approaches in recent years 73
  71. 71. Review of existing works (2) • 100+ papers reviewed – Very limited work for other problems: • Repeated recommendation, trend detection, consideration of constraints 74
  72. 72. Review of existing works (3) • Application domains 75
  73. 73. Summary of first part • Matrix completion abstraction not well suited for many practical problems • In reality, rich user interaction logs are available • Different types of information can be derived from the sequential information in the logs – and used for special recommendation tasks, in particular for the prediction of the next-action • Coming next: – Algorithms and evaluation 76
  74. 74. Part II: Algorithms
  75. 75. Agenda • 13:30 – 15:00 Part 1 – Introduction, Problem definition • 15:00 – 15:30 Coffee break • 15:30 - 16:15 Part 2 – Algorithms • 16:15 – 16:45 Part 3 – Evaluation • 16:45 – 17:00 – Closing, Discussion 78
  76. 76. Taxonomy • Sequence Learning – Frequent Pattern Mining – Sequence Modeling – Distributed Item Representations – Supervised Models with Sliding Window • Sequence-aware Matrix Factorization • Hybrids – Factorized Markov Chains – LDA/Clustering + sequence learning • Others – Graph-based, Discrete-optimization 79
  77. 77. Sequence Learning (SL) • Useful in application domains where input data has an inherent sequential nature – Natural Language Processing – Time-series prediction – DNA modelling – Sequence-Aware Recommendation 80
  78. 78. SL: Frequent Pattern Mining (FPM) 1. Discover user consumption patterns • e.g., association rules, (contiguous) sequential patterns 2. Look for patterns matching partial transactions 3. Rank items by confidence of matched rules • FPM applications – Page prefetching and recommendations – Personalized FPM for next-item recommendation – Next-app prediction 81
  79. 79. SL: Frequent Pattern Mining (FPM) • Pros – Easy to implement – Explainable predictions • Cons – Choice of the minimum support/confidence thresholds – Data sparsity – Limited scalability 82
  80. 80. SL: Sequence Modeling • Sequences of past user actions as time series with discrete observations – Timestamps used only to order user actions • SM aims to learn models from past observations (user actions) to predict future ones • Categories of SM models – Markov Models – Reinforcement Learning – Recurrent Neural Networks 83
  81. 81. SL: Sequence Modeling Markov Models • Stochastic processes over discrete random variables – Finite history (= order of the model) → user actions depend on a limited # of most recent actions • Applications – Online shops – Playlist generation – Variable Order Markov Models for news recommendation – Hidden Markov Models for contextual next track prediction 84
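As a concrete illustration of the simplest case, a first-order Markov model for next-item prediction can be sketched in a few lines. This is not taken from the tutorial; sessions are assumed to be given as lists of item IDs, and all names are illustrative:

```python
from collections import defaultdict

def fit_markov(sessions):
    """Count item-to-item transitions (first-order: only the last action matters)."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for prev_item, next_item in zip(session, session[1:]):
            counts[prev_item][next_item] += 1
    # Normalize counts into transition probabilities P(next | prev)
    return {prev: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
            for prev, nxts in counts.items()}

def recommend_next(model, last_item, k=5):
    """Rank candidate next items by transition probability from the last action."""
    candidates = model.get(last_item, {})
    return sorted(candidates, key=candidates.get, reverse=True)[:k]

sessions = [["a", "b", "c"], ["a", "b", "d"], ["b", "c"]]
model = fit_markov(sessions)
print(recommend_next(model, "b"))  # ['c', 'd'] -- 'c' follows 'b' more often
```

A higher-order model would condition on the last N items instead of only the last one, at the cost of much sparser transition counts.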
  82. 82. SL: Sequence Modeling Reinforcement Learning • Learn by sequential interactions with the environment • Generate recommendations (actions) and collect user feedback (reward) • Markov Decision Processes (MDPs) • Applications – Online e-commerce services – Sequential relationships between the attributes of items explored in a user session 85
  83. 83. SL: Sequence Modeling Recurrent Neural Networks (RNN) • Distributed real-valued hidden state models with non-linear dynamics – Hidden state: latent representation of user state within/across sessions – Update the hidden state based on the current input and its previous value, then use it to predict the probability of the next action • Trained end-to-end with gradient descent • Applications – Next-click prediction with RNNs 86
  84. 84. SL: Distributed Item Representations • Dense, lower-dimensional representations of the items – derived from sequences of events – preserve the sequential relationships between items • Similar to latent factor models – “similar” items are projected into similar vectors – every item is associated with a real-valued embedding vector – its projection into a lower-dimensional space – certain item transition properties are preserved • e.g., co-occurrence of items in similar contexts 87
  85. 85. SL: Distributed Item Representations • Different approaches are possible – Prod2Vec – Latent Markov Embedding (LME) – … 88
  86. 86. SL: Distributed Item Representations - Prod2Vec • v_i = vector representing item i; i_k = item at step k of the session • Learn vectors v_i in order to maximize the joint probability of having items i_{k-2}, i_{k-1}, i_{k+1}, i_{k+2} given item i_k • Sliding window of size N – e.g., N = 5 (figure: window over positions k-2 … k+2) 89
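Prod2Vec is, in essence, word2vec trained on sessions: sessions play the role of sentences and items the role of words. A minimal sketch using gensim's Word2Vec (4.x API) as a stand-in; the session data and hyperparameter values are illustrative, not the original implementation:

```python
from gensim.models import Word2Vec

# Sessions as ordered lists of item IDs play the role of "sentences";
# items play the role of "words".
sessions = [
    ["shoes_1", "shoes_2", "socks_9", "shoes_3"],
    ["tshirt_4", "shoes_2", "shoes_1"],
]

# Skip-gram (sg=1) with a window of 2 items on each side (~N = 5 in the slide);
# vector_size is the embedding dimension.
model = Word2Vec(sessions, vector_size=32, window=2, min_count=1, sg=1)

# Items that co-occur in similar session contexts end up with similar vectors.
print(model.wv.most_similar("shoes_1", topn=3))
```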
  87. 87. SL: Supervised Learning with Sliding Windows 90
  88. 88. Sequence-aware Matrix Factorization • Sequence information usually derived from timestamps → Time-aware recommender systems 91
  89. 89. Hybrid Methods • Sequence Learning + Matrix Completion (CF or CBF) 92
  90. 90. Statistics (figure: bar chart) 93
  91. 91. Going deep: FPM 94
  92. 92. FPM • Association Rule Mining – items co-occurring within the same sessions • no check on order • if you like A and B, you also like C (aka: learning to rank) • Sequential Pattern Mining – items co-occurring in the same order • no check on distance • if you watch A and later watch B, you will later watch C • Contiguous Sequential Pattern Mining – items co-occurring in the same order and at the same distance • if you watch A and B one after the other, you will now watch C 95
  93. 93. FPM • Two-step approach 1. Offline: rule mining 2. Online: rule matching (with current user session) • Rules have – Confidence: conditional probability – Support: number of examples (main parameter) • higher support thresholds -> fewer rules – few rules: difficult to find rules matching a session – many rules: noisy rules (low quality) 96
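A minimal sketch of this two-step approach for pairwise association rules (offline mining with support/confidence thresholds, online matching against the current session). It is illustrative only, not the implementation used in the surveyed papers:

```python
from collections import Counter
from itertools import permutations

def mine_rules(sessions, min_support=2, min_confidence=0.3):
    """Offline step: mine pairwise rules A -> B filtered by support and confidence."""
    item_counts, pair_counts = Counter(), Counter()
    for session in sessions:
        unique = set(session)  # ARM ignores order and repetitions
        item_counts.update(unique)
        pair_counts.update(permutations(unique, 2))
    rules = {}
    for (a, b), support in pair_counts.items():
        confidence = support / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    return rules

def match_rules(rules, current_session, k=5):
    """Online step: fire rules whose antecedent is in the session,
    then rank consequents by confidence."""
    scores = Counter()
    for item in set(current_session):
        for b, conf in rules.get(item, []):
            scores[b] = max(scores[b], conf)
    return [item for item, _ in scores.most_common(k)
            if item not in current_session]

rules = mine_rules([["a", "b"], ["a", "b", "c"], ["a", "c"]])
print(match_rules(rules, ["a"]))  # 'b' and 'c' both follow from 'a'
```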
  94. 94. FPM • More constrained FPM methods produce fewer rules and are computationally more complex – CSPM -> SPM -> ARM • Efficiency and number of mined rules decrease with the number N of past user actions (items) considered in the mining – N=1 : users who like A also like B – N=2 : users who like A and B also like C – … 97
  95. 95. Going deep: RNN 98
  96. 96. Simple Recurrent Neural Network • Hidden state: used to predict the output – Computed from the next input and the previous hidden state (figure: unrolled RNN with hidden states h_0, h_1, h_2, inputs x_1, x_2 and outputs y_1, y_2) 99
  97. 97. Simple Recurrent Neural Network • Hidden state: used to predict the output – Computed from the next input and the previous hidden state • Three weight matrices – h_t = f(x_t^T W_x + h_{t-1}^T W_h + b_h) – y_t = f(h_t^T W_y) (figure: unrolled RNN as above) 100
  101. 101. Simple Recurrent Neural Network • Item of event as 1-of-N coded vector • Example (N = 5): – Input event 2, item 3: x_2 = [0 0 1 0 0] – Output event 2, item 4: y_2 = [0 0 0 1 0] (figure: unrolled RNN as above) 104
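Putting the last few slides together, a minimal numpy sketch of the forward pass with 1-of-N coded inputs. tanh and softmax are assumed choices for f and the output activation, and all sizes are illustrative:

```python
import numpy as np

n_items, n_hidden = 5, 8
rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(n_items, n_hidden))   # input-to-hidden
Wh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden-to-hidden
Wy = rng.normal(scale=0.1, size=(n_hidden, n_items))   # hidden-to-output
bh = np.zeros(n_hidden)

def one_hot(item, n=n_items):
    x = np.zeros(n)
    x[item] = 1.0
    return x

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(n_hidden)           # h_0 = 0
for item in [2, 3]:              # session: item 3, then item 4 (0-indexed)
    x = one_hot(item)
    h = np.tanh(x @ Wx + h @ Wh + bh)  # h_t = f(x_t Wx + h_{t-1} Wh + b_h)
    y = softmax(h @ Wy)                # y_t: probability of each next item
print(y.argsort()[::-1])  # ranking of candidate next items
```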
  102. 102. Distributed Item Representations: Prod2Vec • v_i = vector representing item i; i_k = item at step k of the session • Learn vectors v_i in order to maximize the joint probability of having items i_{k-2}, i_{k-1}, i_{k+1}, i_{k+2} given item i_k • Sliding window of size N – e.g., N = 5 (figure: window over positions k-2 … k+2) 105
  103. 103. Simple Recurrent Neural Network • Fairly effective in modeling sequences – good for session-based problems • Problems – exploding gradients → careful regularization – careful initialization – how to manage attributes of events? – how to manage multiple sessions? (figure: unrolled RNN as above) 106
  104. 104. Multiple sessions • Naïve solution: concatenation • Whole user history as a single sequence • Trivial implementation but limited effectiveness → Hierarchical RNN (HRNN) (figure: sessions 1 … N of user 1 concatenated into one RNN input sequence) 108
  106. 106. HRNN Architecture (figure: session-level and user-level RNNs over sessions 1 and 2 of user 1) 109
  107. 107. Architecture • Within-session hidden states: – Initialization: s_{1,0} = 0 – Update: s_{1,n} = RNN_ses(i_{1,n}, s_{1,n-1}) • Across-session hidden states: – Initialization: c_0 = 0 110
  108. 108. Architecture • Across-session: – Update: c_m = RNN_usr(s_m, c_{m-1}), where s_m is the last session-state and c_{m-1} the previous user-state 111
  109. 109. Architecture – HRNN Init • From the 2nd session on: – Initialization: s_{m,0} = f(W_init c_{m-1} + b_init) – Update: s_{m,n} = RNN_ses(i_{m,n}, s_{m,n-1}) 112
  110. 110. Architecture – HRNN All • From the 2nd session on: – Initialization: s_{m,0} = f(W_init c_{m-1} + b_init) – Update: s_{m,n} = RNN_ses(i_{m,n}, s_{m,n-1}, c_{m-1}) 113
  111. 111. Architecture - Complete (figure: full HRNN unrolled over sessions 1 and 2 of user 1) • Two identical sessions from different users will produce different recommendations 115
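A schematic PyTorch sketch of the HRNN Init variant (the user-level state initializes each new session-level state). Layer types, sizes, and tanh as f are assumptions for illustration, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

n_items, d_ses, d_usr = 100, 64, 64

item_emb = nn.Embedding(n_items, d_ses)
rnn_ses = nn.GRUCell(d_ses, d_ses)   # within-session RNN (RNN_ses)
rnn_usr = nn.GRUCell(d_ses, d_usr)   # across-session, user-level RNN (RNN_usr)
init = nn.Linear(d_usr, d_ses)       # s_{m,0} = f(W_init c_{m-1} + b_init)
out = nn.Linear(d_ses, n_items)      # scores over next items

c = torch.zeros(1, d_usr)                  # user state, c_0 = 0
user_sessions = [[3, 7, 7, 12], [5, 3]]    # toy data: item IDs per session
for m, session in enumerate(user_sessions):
    # First session starts from zero; later ones from the user representation
    s = torch.zeros(1, d_ses) if m == 0 else torch.tanh(init(c))
    for item in session:
        s = rnn_ses(item_emb(torch.tensor([item])), s)
        scores = out(s)                    # ranking over candidate next items
    c = rnn_usr(s, c)                      # update user state with last session state

print(scores.topk(5).indices)  # top-5 next-item suggestions after the last event
```

This is why two identical sessions from different users produce different recommendations: the session state is seeded from a user-specific c.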
  113. 113. Number of hidden layers • One is enough • Add more layers if you have – many sessions, or – many interactions per session 116
  114. 114. Including attributes • Naïve solutions fail – e.g., concatenating the item-ID one-hot vector with the feature vector as a single input (figure: item ID and feature vector fed through weights w_ID, w_features into one hidden layer that outputs scores on items) 117
  115. 115. Late-merging -> Parallel RNNs (figure: separate subnets for the item-ID one-hot vector and the feature vector, each with its own hidden layer; the subnet outputs are combined into a weighted output that scores items) 118
  116. 116. Training Parallel RNNs • Straight backpropagation fails → alternative training methods • Train one subnet at a time and freeze the others (like in ALS!) • Depending on the frequency of alternation – Residual training (train fully) – Alternating training (per epoch) – Interleaving training (per mini-batch) 119
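A minimal sketch of the alternating-training scheme (one subnet updated per epoch while the other is frozen), with toy stand-in modules; the model structure is hypothetical. PyTorch optimizers simply skip parameters that receive no gradient:

```python
import torch
import torch.nn as nn

# Two hypothetical subnets (one per input type), as in a parallel RNN
id_subnet = nn.Linear(8, 4)       # stand-in for the item-ID subnet
feature_subnet = nn.Linear(6, 4)  # stand-in for the feature subnet
head = nn.Linear(4, 10)           # merged output scoring 10 items

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

params = (list(id_subnet.parameters()) + list(feature_subnet.parameters())
          + list(head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x_id, x_feat = torch.randn(32, 8), torch.randn(32, 6)
target = torch.randint(0, 10, (32,))

for epoch in range(4):
    # Alternating training: update one subnet per epoch, freeze the other
    set_trainable(id_subnet, epoch % 2 == 0)
    set_trainable(feature_subnet, epoch % 2 == 1)
    scores = head(id_subnet(x_id) + feature_subnet(x_feat))  # simple sum-merge
    loss = loss_fn(scores, target)
    opt.zero_grad(set_to_none=True)
    loss.backward()
    opt.step()
```

Interleaving training would move the `set_trainable` switch inside the mini-batch loop instead of the epoch loop.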
  117. 117. Part III: Evaluation and Datasets
  118. 118. Agenda • 13:30 – 15:00 Part 1 – Introduction, Problem definition • 15:00 – 15:30 Coffee break • 15:30 - 16:15 Part 2 – Algorithms • 16:15 – 16:45 Part 3 – Evaluation • 16:45 – 17:00 – Closing, Discussion 121
  119. 119. Traditional evaluation approaches • Off-line evaluation – evaluation metrics • Error metrics: RMSE, MAE • Classification metrics: Precision, Recall, … • Ranking metrics: MAP, MRR, NDCG, … – dataset partitioning (hold-out, leave-one-out, …) – available datasets • On-line evaluation – user studies – field tests
  120. 120. Dataset partitioning • Splitting the data into training and test sets – event-level (figure: each user's time-ordered events split into training and test at an event cutoff) – session-level (figure: whole sessions assigned to either training or test) 123
  123. 123. Dataset partitioning • Community level • User-level (figures: training and test portions of the event timeline, with training users and test users shown separately; test users keep a profile portion) 126
  127. 127. Dataset partitioning • Session-based or session-aware tasks: use session-level partitioning – better mimics real systems trained on past sessions • Session-based tasks: use session-level + user-level partitioning – to test on new users
  128. 128. • Fixed – e.g., 80% training / 20% test • Time rolling – Move the time-split forward in time Split size
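For instance, a session-level hold-out with a fixed 80/20 time cutoff could be sketched in pandas as follows; the input file and column names (user_id, session_id, item_id, timestamp, assumed numeric) are assumptions:

```python
import pandas as pd

# Event log with assumed columns: user_id, session_id, item_id, timestamp
events = pd.read_csv("events.csv")  # hypothetical file

# Session-level split: assign each whole session to train or test
# based on when the session started (no session is cut in half).
session_start = events.groupby("session_id")["timestamp"].min()
cutoff = session_start.quantile(0.8)  # fixed 80/20 split along time

train_sessions = session_start[session_start <= cutoff].index
test_sessions = session_start[session_start > cutoff].index

train = events[events["session_id"].isin(train_sessions)]
test = events[events["session_id"].isin(test_sessions)]
```

A time-rolling evaluation would repeat this with the cutoff moved forward step by step, retraining on each enlarged training window.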
  129. 129. • Sequence-agnostic prediction – e.g., similar to matrix completion • Given-N next item prediction – e.g., next track prediction • Given-N next item prediction with look-ahead – e.g., predict a sequence of events • Given-N next item prediction with look-back – e.g., how many events are required to predict a final product • In some scenarios, the same item can be consumed many times – e.g., next track recommendation, repeated recommendations, … (figures: given events vs. target items along the event timeline)
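The Given-N family of protocols boils down to revealing the first N events of a test session and hiding the rest. A small illustrative sketch (hypothetical helper, not from the tutorial):

```python
def given_n_case(session, n, look_ahead=1):
    """Reveal the first n events; the next `look_ahead` events are the targets."""
    if len(session) < n + 1:
        return None  # session too short to build a test case
    profile = session[:n]
    targets = session[n:n + look_ahead]
    return profile, targets

session = ["a", "b", "c", "d", "e"]
print(given_n_case(session, n=2))                # (['a','b'], ['c'])          next item
print(given_n_case(session, n=2, look_ahead=3))  # (['a','b'], ['c','d','e'])  look-ahead
```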
  135. 135. Evaluation metrics • Classification metrics – Precision, Recall, … – good for context-adaptation (traditional matrix-completion task) • Ranking metrics – MAP, NDCG, MRR, … – good with fixed-size splits • Multi-metrics – Ranking + others (diversity)
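For next-item prediction with a single ground-truth item per test case, two of the listed metrics reduce to one-liners; a small illustrative sketch:

```python
def recall_at_k(ranked, target, k=20):
    """1 if the target item appears in the top-k recommendations, else 0."""
    return int(target in ranked[:k])

def mrr(ranked, target):
    """Reciprocal of the rank of the target item (0 if it is not recommended)."""
    return 1.0 / (ranked.index(target) + 1) if target in ranked else 0.0

ranked = ["c", "a", "d", "b"]  # recommendation list for one test case
print(recall_at_k(ranked, "d", k=3), mrr(ranked, "d"))  # 1 0.333...
```

Averaging these values over all test cases gives the reported Recall@k and MRR.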
  136. 136. Datasets 139

      Name          Domain  Users  Items  Events  Sessions
      Amazon        EC      20M    6M     143M    -
      Epinions      EC      40k    110k   181k    -
      Ta-feng       EC      32k    24k    829k    -
      TMall         EC      1k     10k    5k      -
      Retailrocket  EC      1.4M   235k   2.7M    -
      Microsoft     WWW     27k    8k     55k     32k
      MSNBC         WWW     1.3M   1k     476k    -
      Delicious     WWW     8.8k   3.3k   60k     45k
      CiteULike     WWW     53k    1.8k   2.1M    40k
      Outbrain      WWW     700M   560    2B      -
  137. 137. Datasets 140

      Name          Domain  Users  Items  Events  Sessions
      AOL           Query   650k   -      17M     2.5M
      Adressa       News    15k    923    2.7M    -
      Foursquare_2  POI     225k   -      22.5M   -
      Gowalla_1     POI     54k    367k   4M      -
      Gowalla_2     POI     196k   -      6.4M    -
      30Music       Music   40k    5.6M   31M     2.7M
  138. 138. Thank you for your attention Massimo Quadrana, Pandora Media, USA mquadrana@pandora.com Paolo Cremonesi, Politecnico di Milano paolo.cremonesi@polimi.it Dietmar Jannach, AAU Klagenfurt, Austria dietmar.jannach@aau.at
