# Recommender Systems

• Similarity Weights Optimization: also known as "neighborhood modeling through global optimization". In SWO the similarity function (Pearson, cosine) is used only to determine the neighbours; the weights for the weighted average are found via an optimization process that minimizes the total prediction error, with the weights as the optimized parameters of the error function. The difference between NN CF and SWO is that in NN CF the similarity function (Pearson, cosine) is used both to determine the nearest neighbours and to determine the weights in the weighted average of the prediction. SWO requires data normalization.
• In some situations the system can be asked for a recommendation tailored to a group of people. For example, if a family is sitting together watching TV, the system needs to recommend something that suits the family as a whole. A sports show might be more interesting for the father, but would leave other members of the family unsatisfied. In some systems the group is dynamic, and its members change over time, which requires constant adjustment on the system's part. The satisfaction of individuals may be a complex matter: for example, if a TV show makes the children happy, the mother may also be (indirectly) happy just because her children are happy. In some cases multiple items are recommended to the group; for example, in a trip recommender there is time to visit 4 different places within a day's trip, and different members prefer to visit different locations. [1,2,3]
### Recommender Systems

1. Recommender Systems. Lior Rokach, Department of Information Systems Engineering, Ben-Gurion University of the Negev
2. About Me. Prof. Lior Rokach, Department of Information Systems Engineering, Faculty of Engineering Sciences, Head of the Machine Learning Lab, Ben-Gurion University of the Negev. Email: liorrk@bgu.ac.il, http://www.ise.bgu.ac.il/faculty/liorr/. PhD (2004) from Tel Aviv University.
3. Are You Being Served? What are you looking for? Demographic: age, gender, etc. Context: casual/event, season, gift. Purchase history: loyal customer; what is the customer currently wearing? Style, color. Social: friends and family, companion.
4. Recommender Systems. A recommender system (RS) helps people who lack sufficient personal experience or competence to evaluate the potentially overwhelming number of alternatives offered by a Web site. In their simplest form, RSs recommend personalized, ranked lists of items to their users, providing consumers with information that helps them decide which items to purchase.
5. Example applications
6. What book should I buy?
7. What movie should I watch? The Internet Movie Database (IMDb) provides information about actors, films, television shows, television stars, video games and production crew personnel. Owned by Amazon.com since 1998. 796,328 titles and 2,127,371 people. More than 50M users per month.
8. The Netflix Prize story. In October 2006, Netflix announced it would give $1 million to whoever created a movie-recommending algorithm 10% better than its own. Within two weeks, the DVD rental company had received 169 submissions, including three that were slightly superior to Cinematch, Netflix's recommendation software. After a month, more than a thousand programs had been entered, and the top scorers were almost halfway to the goal. But what started out looking simple suddenly got hard: the rate of improvement began to slow, the same three or four teams clogged the top of the leaderboard, progress was almost imperceptible, and people began to say a 10 percent improvement might not be possible. Three years later, on 21 September 2009, Netflix announced the winner. 30.07.2012
9. What news should I read?
10. Where should I spend my vacation? Tripadvisor.com. "I would like to escape from this ugly and tedious work life and relax for two weeks in a sunny place. I am fed up with these crowded and noisy places … just the sand and the sea … and some 'adventure'." "I would like to bring my wife and my children on a holiday … it should not be too expensive. I prefer mountainous places, not too far from home. Children's parks, easy paths and good cuisine are a must." "I want to experience the contact with a completely different culture. I would like to be fascinated by the people and learn to look at my life in a totally different way."
11. Usage in the market/products. [Table: method commonness across examined state-of-the-art solutions: Jinni, Taste Kid, Nanocrowd, Clerkdogs, Criticker, IMDb, Flixster, Movielens, Netflix, Shazam, Pandora, LastFM, YooChoose, Think Analytics, iTunes, Amazon. Methods compared: Collaborative Filtering; Content-Based Techniques; Knowledge-Based Techniques; Stereotype-Based Recommender Systems; Ontologies and Semantic Web Technologies for Recommender Systems; Hybrid Techniques; Ensemble Techniques for Improving Recommendation (marked "future"); Context-Dependent Recommender Systems; Conversational/Critiquing Recommender Systems; Community-Based Recommender Systems and Recommender Systems 2.0.]
12. Selected Methods
13. Presenting the three selected methods. 1. Collaborative Filtering: "Customers who bought this item also bought…"; "The wisdom of crowds". 2. Ensemble. 3. Context-Based: "Tell me the music that I want to listen to NOW".
14. Presenting the three selected methods (continued). 4. Cross-Domain: "Can movies and books collaborate?" 5. Community: "Tell me who your friends are, and I will tell you who you are." 6. Group: "Can you recommend a movie for me and my friends?"
15. Method 1: Collaborative Filtering
16. Collaborative Filtering. Description: the method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future. Selected techniques: kNN (nearest neighbour), SVD (matrix factorization), Similarity Weights Optimization (SWO).
17. Collaborative Filtering: Overview. The idea: try to predict the opinion the user will have on the different items and recommend the "best" items to each user, based on the user's previous likings and the opinions of other like-minded users.
18. Collaborative Filtering: How does it work? "People who liked this also liked…" User-to-user: recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim. This approach does not scale well to millions of users. Item-to-item: recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended. This approach is scalable to millions of users and millions of items.
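The user-to-user idea above can be sketched in a few lines: predict a missing rating as the similarity-weighted average of the ratings given by the most similar users. This is a minimal illustration with invented data, not the deck's actual implementation.

```python
import numpy as np

def predict_user_based(R, user, item, k=2):
    """Predict R[user, item] from the k most similar users who rated the item.

    R is a ratings matrix with np.nan marking missing entries.
    """
    candidates = [u for u in range(R.shape[0])
                  if u != user and not np.isnan(R[u, item])]
    sims = []
    for u in candidates:
        mask = ~np.isnan(R[user]) & ~np.isnan(R[u])   # co-rated items only
        a, b = R[user, mask], R[u, mask]
        sims.append((a @ b / (np.linalg.norm(a) * np.linalg.norm(b)), u))
    top = sorted(sims, reverse=True)[:k]              # k most similar users
    return sum(s * R[u, item] for s, u in top) / sum(s for s, _ in top)

nan = np.nan
R = np.array([[5.0, 4.0, 1.0, nan],    # target user: last item unknown
              [5.0, 4.0, 1.0, 5.0],    # identical taste -> weight 1.0
              [1.0, 2.0, 5.0, 1.0]])   # opposite taste -> smaller weight
print(round(predict_user_based(R, 0, 3), 2))
```

Note how the prediction lands between the two neighbours' ratings, pulled mostly toward the like-minded user.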
19. Collaborative Filtering: Rating Matrix. The ratings of users and items are represented in a matrix, and all CF methods are based on such a rating matrix. Items: the items in the system. Users: the users in the system. Ratings: each item may have a rating.
20. Collaborative Filtering: What is new? A few words about the techniques. Collaborative filtering is one of the most common recommendation methods in the market today. Up until two years ago, the kNN ("k" nearest neighbour) technique was the norm. SVD (Singular Value Decomposition), which was shown to be successful in the Netflix recommendation competition, became common in the last year. SWO is a newer technique aiming to enhance the veteran kNN. In the following slides the three techniques are presented; it is important to get acquainted with them, as they will be employed by the Ensemble.
21. Method 1: Collaborative Filtering. Selected Techniques Explained
22. Technique 1: kNN (Nearest Neighbour)
23. kNN: High-level explanation. The k-nearest neighbours algorithm is a method for classifying objects based on the closest training examples in the feature space. It is assumed that similar samples are grouped together; "k" is the number of neighbours, a proximity measure. Recommendation example: finding the most relevant song by comparing to a set of already-heard ones.
24. kNN: Schematic example. The current user did not rate an item (1 = like, 0 = dislike, ? = unknown); we try to predict a rating for it. Other users rated the same item, and we are interested in the nearest neighbours among them. The prediction is made according to the nearest neighbour: the one with the lowest Hamming distance to the current user's rating vector. The Hamming distance, named after Richard Hamming, is, in information theory, the number of positions at which the corresponding symbols of two equal-length strings differ.
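The slide's Hamming-distance flavour of kNN can be sketched as follows; the vectors, users, and unknown position are invented for illustration.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(current, others, item):
    """Copy the unknown rating from the neighbour with the lowest
    Hamming distance over the positions the current user has rated."""
    known = [i for i, r in enumerate(current) if r is not None]
    profile = [current[i] for i in known]
    nearest = min(others, key=lambda o: hamming(profile, [o[i] for i in known]))
    return nearest[item]

current = [1, 0, None, 1, 0]          # 1 = like, 0 = dislike, None = unrated
others = [[1, 0, 1, 1, 0],            # distance 0 on the known positions
          [0, 1, 0, 0, 1]]            # distance 4
print(knn_predict(current, others, 2))   # prediction for the unrated item: 1
```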
25. Technique 2: SVD (Singular Value Decomposition)
26. SVD: Matrix factorization technique. SVD is extraordinarily useful and has many applications, such as data analysis, signal processing, pattern recognition, image compression, weather prediction, and Latent Semantic Analysis (LSA). It was probably the most popular model among Netflix contestants and has become the collaborative filtering standard. SVD is a widely used technique to decompose a matrix into several component matrices, exposing many of the useful and interesting properties of the original matrix.
27. SVD: Matrix factorization technique (continued). In the recommender systems field, SVD models users and items as vectors of latent features whose cross product produces the rating of the item by the user. With SVD, a matrix is factored into a series of linear approximations that expose the underlying structure of the matrix. The goal is to uncover latent features that explain the observed ratings.
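As a concrete toy illustration of the factorization described above, the following keeps only the strongest latent factor of a small rating matrix; the values are invented.

```python
import numpy as np

R = np.array([[5.0, 5.0, 1.0],     # two users with similar taste...
              [4.0, 4.0, 1.0],
              [1.0, 1.0, 5.0]])    # ...and one with the opposite taste

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 1                                   # latent factors to keep
R_hat = U[:, :k] * s[:k] @ Vt[:k, :]    # rank-k reconstruction

# R_hat approximates R through a single hidden "concept"; increasing k
# recovers R exactly once k reaches the rank of the matrix.
```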
28. Latent Factor Models: Schematic example. From users and ratings, SVD reveals hidden connections (latent concepts or factors) and their strength; a revealed concept might be, for example, "males that like watching serious movies".
29. Latent Factor Models: Schematic example (continued). SVD revealed a movie this user might like!
30. Latent Factor Models: Concept space
31. Method 1: Collaborative Filtering. Technique 3: SWO (Similarity Weights Optimization)
32. SWO vs. Nearest Neighbour. SWO: the similarity function (Pearson, cosine) is used only to determine the neighbours; the weights for the weighted average are found via an optimization process that minimizes the total prediction error. kNN: the similarity function (Pearson, cosine) is used both to determine the nearest neighbours and to determine the weights in the weighted average of the prediction.
33. SWO: Data Normalization. We need to identify relations and mix ratings across items/users; however, user- and item-specific variability masks the fundamental relationships. Examples: some items are systematically rated higher; some items were rated by users who tend to rate low; ratings change over time. Normalization is critical to the success of a kNN approach.
34. SWO: Data Normalization (continued). Remove data characteristics that are unlikely to be explained by kNN. Common practice is to use centering: remove user and item means. A more comprehensive approach eliminates additional interfering variability, such as time effects. Here, we normalize by removing the baseline estimates.
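A minimal sketch of the centering step, assuming baselines of the common form b(u, i) = global mean + user offset + item offset; the ratings are invented.

```python
import numpy as np

R = np.array([[5.0, 4.0, np.nan],
              [3.0, np.nan, 1.0],
              [4.0, 3.0, 2.0]])

mu = np.nanmean(R)                       # global mean rating
b_user = np.nanmean(R, axis=1) - mu      # per-user offset from the mean
b_item = np.nanmean(R, axis=0) - mu      # per-item offset from the mean

baseline = mu + b_user[:, None] + b_item[None, :]
residuals = R - baseline                 # what kNN is left to explain
```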
35. SWO: Neighbourhood modeling through global optimization. A basic model.
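The "basic model" can be sketched as fitting interpolation weights by least squares: the neighbours are still chosen by similarity, but weights minimizing the squared prediction error replace the similarity scores. Names and data are illustrative, not the deck's exact model.

```python
import numpy as np

def swo_weights(target, neighbours):
    """Weights w minimizing ||neighbours.T @ w - target||^2 (total error)."""
    w, *_ = np.linalg.lstsq(neighbours.T, target, rcond=None)
    return w

# Normalized ratings of 3 already-selected neighbours (rows) on 4 items.
neighbours = np.array([[ 1.0, -1.0,  1.5, -1.5],
                       [ 1.0, -1.5,  1.5, -2.0],
                       [-2.0,  1.5, -1.0,  1.0]])
target = np.array([1.0, -1.0, 1.5, -1.5])   # equals neighbour 0 exactly

w = swo_weights(target, neighbours)
pred = neighbours.T @ w          # fitted ratings for the target user
```

Because the target here coincides with the first neighbour, the optimizer puts all weight on that neighbour and the fit is exact.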
36. Method 2: Ensemble
37. Ensemble. Description: the ensemble methodology imitates the human nature of seeking advice before making any crucial decision; "two heads are better than one". Selected techniques: Bagging (Breiman, 1996), AdaBoost (Freund and Schapire, 1996), Random Parameter Manipulation. The innovation is adopting the ensemble concept from the general machine learning field to the recommender system domain.
38. Ensemble at 30,000 feet: Overview. When important decisions have to be made, society often places its trust in groups of people: we have parliaments, juries, committees, and boards of directors, whom we are happy to have make decisions for us. Ensemble imitates the human nature of seeking advice before making any crucial decision. It is achieved by weighing the individual opinions and combining them before reaching a final decision, hence the names "the wisdom of crowds" and "committee of experts". We can ensure that the ensemble will produce results that are, in the worst case, as bad as the worst classifier in the ensemble.
39. Ensemble: Overview. What is it? If you think about it, Ensemble is not a question to be answered. So what is it then? Ensemble is the answer. So what is the question? How to improve results!
40. Ensemble: Improving results. Why do we care? Because having improved results will prevent cases like this.
41. Ensemble: A short story. Francis Galton promoted statistics and invented the concept of correlation. In 1906, Galton visited a livestock fair and stumbled upon an intriguing contest: an ox was on display, and the villagers were invited to guess the animal's weight. Nearly 800 gave it a go and, not surprisingly, not one hit the exact mark: 1,198 pounds. Astonishingly, however, the average of those 800 guesses came close, very close indeed: it was 1,197 pounds.
42. Ensemble: Does it always work? No. Not all crowds (groups) are wise; for example, crazed investors in a stock market bubble.
43. Ensemble: Schematic example. Recommender 1, Recommender 2 and Recommender 3 may all be just weak learners. Problem example: linear recommenders cannot solve non-linearly separable problems; however, their combination can. Their outputs are merged into a combined recommender.
44. Why use Ensembles? Statistical reasons (risk reduction): out of many recommender models with similar training/test errors, which one shall we pick? If we pick one at random, we risk choosing a really poor one; combining or averaging them may prevent us from making one such unfortunate decision. Computational reasons: every time we run a recommendation algorithm, we may find different local optima; combining their outputs may allow us to find a solution closer to the global minimum. Too little data / too much data: generate multiple recommenders by re-sampling the available data, or from mutually exclusive subsets of the available data. Representational reasons: the recommender space may not contain the solution to a given problem, but an ensemble of such recommenders may.
45. Ensemble: The Diversity Paradox. Diversity vs. accuracy: on one hand we expect the ensemble members to be as good as possible, so they all target the same goal; on the other hand they have to be independent, which means different, hence lowering the accuracy. There is no real paradox: ideally, all committee members would be right about everything; if not, they should be wrong about different things.
46. Single-model Ensemble RS: Example configuration. Step 1: users, items and ratings are the input. Step 2: generate different variations of the same input rating matrix. Step 3: apply the actual CF method and technique. Step 4: produce several different recommendations. Step 5: combine the different recommendations. Step 6: the ensemble generates more accurate predictions than each individual RS.
47. Netflix Prize: The Competition. In October 2006, Netflix announced it would give $1 million to whoever created a movie-recommending algorithm 10% better than its own. Within two weeks, the DVD rental company had received 169 submissions, including three that were slightly superior to Cinematch, Netflix's recommendation software. After a month, more than a thousand programs had been entered, and the top scorers were almost halfway to the goal. But what started out looking simple suddenly got hard: the rate of improvement began to slow, and the same three or four teams clogged the top of the leaderboard. Progress was almost imperceptible, and people began to say a 10 percent improvement might not be possible. Three years later, on 21 September 2009, Netflix announced the winner.
48. Netflix Prize: The winning team used an Ensemble. Fact: actually, the top 100 solutions were ensemble based.
49. Netflix Prize: And the winner is… We have a winner, so why bother? You may ask yourself why we need to further research and develop the Ensemble. Because the prize was solved in a manually tailored way, combining a set of predefined methods, there is plenty of room for improvement.
50. Netflix Prize: The real winner is the method! One could say that the ensemble techniques and methods helped tip the scales. While the algorithms and good knowledge of statistics go a long way, it was ultimately the cross-team collaboration that ended the contest. It is easy to overlook the fact that many teams were actually committees of experts by themselves: "The Ensemble" team, appropriately named for the technique they used to merge their results, consisted of over 30 people. Likewise, the winning team was a collaborative effort of several distinct groups that merged their results.
51. Method 2: Ensemble. Selected Techniques Explained
52. Technique 1: Bagging (Breiman, 1996)
53. Bagging: Overview. Introduced by Breiman (1996), "bagging" stands for "bootstrap aggregating". It is an ensemble method, i.e. a method of combining multiple predictors. The intuition is that by using only part of the data, and making some data (randomly) have more impact, you get a better variety of models, which reduces overfitting.
54. Bagging-based sampling of the rating matrix: Schematic example. Step 1: a random subset of the training set is taken.
55. Bagging-based sampling of the rating matrix: Schematic example (continued).
56. Bagging-based sampling of the rating matrix: Schematic example (continued). Step 2: some of the data in this subset is duplicated several times.
57. Bagging: From here to a recommendation. The input set is given to one of the recommendation methods; this is repeated until every method has an input set; the average result (or the most common one) is picked.
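The steps above can be sketched end to end; the base learner here is a deliberately trivial per-item mean predictor, and the ratings are invented.

```python
import random

random.seed(0)   # deterministic demo
ratings = [("u1", "i1", 5), ("u2", "i1", 4), ("u3", "i1", 3),
           ("u1", "i2", 2), ("u2", "i2", 1)]

def train(sample):
    """'Weak' recommender: per-item mean rating over one bootstrap sample."""
    by_item = {}
    for _, item, r in sample:
        by_item.setdefault(item, []).append(r)
    return {item: sum(rs) / len(rs) for item, rs in by_item.items()}

# Steps 1-2: bootstrap resamples (with replacement, so some rows repeat).
models = [train([random.choice(ratings) for _ in ratings]) for _ in range(20)]

# Final step: average the predictions of the models that saw item i1.
preds = [m["i1"] for m in models if "i1" in m]
bagged = sum(preds) / len(preds)
```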
58. Technique 2: AdaBoost (Freund and Schapire, 1996)
59. AdaBoost: Overview. Introduced by Freund and Schapire (1996), "AdaBoost" stands for "adaptive boosting". Boosting turns a "weak" learning algorithm into a "strong" learning algorithm. It is an ensemble method in which training samples are weighted differently across the ensemble members.
60. AdaBoost: The process. We start by building an initial model. Next, that model is improved by modifying the input (training) set to emphasize (for example by duplicating) the part of the input where the model was less accurate. The model is rebuilt and checked for its accuracy. The process repeats until the error of the model is lower than some bound.
61. AdaBoost: Schematic example. Step 1: we start by building an initial model. Step 2: the model is improved by modifying the input set to emphasize the part of the input where the model was less accurate. Step 3: the model is rebuilt and checked for its accuracy. Final step: the process repeats until the error of the model is lower than some bound, yielding the combined recommender.
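A schematic version of this loop, with a one-parameter "weak" model and duplication of the worst-fit example; the data, model, and bound are all invented for illustration (real AdaBoost weights samples and combines the rounds).

```python
# Hypothetical (input, target) pairs roughly following y = 2x.
data = [(1, 2.1), (2, 3.9), (3, 6.0), (4, 8.2), (5, 9.8)]

def fit(sample):
    """'Weak' model: a single slope, the average of target/input ratios."""
    return sum(y / x for x, y in sample) / len(sample)

sample, bound = list(data), 0.01
for _ in range(10):                        # Step 1/3: (re)build the model
    slope = fit(sample)
    errors = [(abs(y - slope * x), (x, y)) for x, y in sample]
    if max(e for e, _ in errors) < bound:  # stop when under the error bound
        break
    sample.append(max(errors)[1])          # Step 2: duplicate the worst case
```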
62. Technique 3: Random Parameter Manipulation
63. Random Parameter Manipulation: Overview. The idea is to have multiple variations of the same recommendation technique. The variations are formed by changing the input parameters systematically. The ensemble is achieved by combining the modified recommenders in order to produce a unified prediction.
64. Random Parameter Manipulation: Schematic example. Example: averaging multiple SVD matrices based on different values of F. Variations of SVD: different F values, 3 to 5. Ensemble: the combined recommenders.
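The averaging-over-F example can be sketched directly with NumPy's SVD; the rating matrix is random toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(1, 6, size=(6, 6)).astype(float)   # toy 1-5 rating matrix

def svd_predict(R, F):
    """Rank-F SVD reconstruction of the rating matrix."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :F] * s[:F] @ Vt[:F, :]

# Ensemble: average the predictors built with F = 3, 4, 5 (as on the slide).
ensemble = np.mean([svd_predict(R, F) for F in (3, 4, 5)], axis=0)
```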
65. Method 2: Ensemble. Testing coverage
66. Ensemble: Testing coverage. Each of the three CF techniques will be tested with each ensemble technique, giving 9 possible combinations of techniques. The diagram is colour coded for convenience.
67. Method 3: Context-Based Recommendation
68. Context-Based. Description: adapting the recommendations to the specific user context; "tell me the music that I want to listen to NOW". Selected techniques: Item Split, Linear Models.
69. Context-Based Recommender Systems: Overview. The recommender system uses additional data about the context of an item's consumption. For example, in the case of a restaurant, the time or the location may be used to improve the recommendation compared to what could be done without this additional source of information: a restaurant recommendation for a Saturday evening with your spouse should be different from one for a workday afternoon with co-workers.
70. Context-Based Recommender Systems: Motivation. Motivating examples: recommend a vacation (winter vs. summer); recommend a purchase at an e-retailer (gift vs. for yourself); recommend a movie (to a student who wants to see it on Saturday night with his girlfriend in a movie theater).
71. Context-Based Recommender Systems: Motivation (continued). Recommend music: the music we like to hear is greatly affected by context, which can be thought of as a mixture of our feelings (mood) and the situation or location (the theme) we associate it with. Listening to Bruce Springsteen's "Born in the U.S.A." while driving along the 101; listening to Mozart's Magic Flute while walking in Salzburg.
72. Information discovery example: "Tell me the music that I want to listen to NOW". Musicovery.com is an interactive, personalized Web radio. A mood matrix proposes a relationship between music and mood; it spans 20 genres and time periods plus a popularity scale (hits vs. less-known songs/discovery), and covers all musical genres, from rap to funk via electro, rock, disco, and classical. Ethnographic studies have shown that people choose music pieces according to their mood or expected mood change; Musicovery relies on this principle to build an effective relationship between music and emotion.
73. Context-Based Recommender Systems: Context vs. others. What do simple recommendation techniques ignore? What is the user doing when asking for a recommendation? Where (and when) is the user? What does the user want (e.g., to improve his knowledge or really buy a product)? Is the user alone or with other people? Are there many products to choose from or only a few?
74. Context-Based Recommender Systems: Context vs. others (continued). The same questions apply; the point is that plain recommendation technologies fail to take the user context into account.
75. Context-Based Recommender Systems: Foundations. Contextual computing refers to the enhancement of a user's interactions by understanding the user, the context, and the applications and information being used, typically across a wide set of user goals. It means actively adapting the computational environment, for each and every user, at each point of computation. The contextual computing approach focuses on understanding the information consumption patterns of each user, and on the process, not only the output, of the search process. [Pitkow et al., 2002]
76. Context-Based Recommender Systems: Major obstacles. Major obstacles for contextual computing: obtaining sufficient and reliable data describing the user context; selecting the right information, i.e., what is relevant to a particular personalization task; understanding the impact of contextual dimensions on the personalization process; computationally modeling the contextual dimension within a more classical recommendation technology. For instance: how do we extend collaborative filtering to include contextual dimensions?
77. Method 3: Context-Based Recommendation. Selected Techniques Explained: Item Split
78. Item Split: Intuition and approach. The same item in different contextual conditions may produce a different user experience, so we consider the same item in different contexts as distinct items. Research goal: provide better music recommendations; improve collaborative filtering accuracy when the user context is known.
79. Context in Collaborative Filtering. "Context is any information that can be used to characterize the situation of an entity" [A. K. Dey, 2001]. In the item splitting approach, similarly to [Adomavicius et al., 2005], we model the context with a set of dynamic features of the rating, representing conditions that can rapidly change their state. When a user evaluates an item, the rating is recorded together with the current state of the contextual variables. CF does not provide a direct method to integrate such additional information into the recommendation process.
80. Context-Based Recommender Systems – Reduction-Based Approach  Reduce the problem of multi-dimensional recommendation to the traditional two-dimensional User × Item setting  For each “value” of the contextual dimension(s), estimate the missing ratings with a traditional method  Example  R: U × I × T → [0,1] ∪ {?}; User, Item, Time  R_D(u, i, t) = R_D[T=t](u, i)  The context-dependent estimation for (u, i, t) is computed with a traditional approach, in a two-dimensional setting, but using only the ratings that have T=t.
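The reduction step above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `ratings` is a hypothetical list of (user, item, context, rating) tuples, and `predict_2d` stands in for any traditional two-dimensional CF method (here a simple item-average baseline, purely for illustration).

```python
# Reduction-based approach: keep only the ratings given in context t,
# then apply a traditional two-dimensional predictor to that slice.

def predict_2d(ratings_2d, user, item):
    """Illustrative 2D predictor: the item's mean rating."""
    item_ratings = [r for (u, i, r) in ratings_2d if i == item]
    return sum(item_ratings) / len(item_ratings) if item_ratings else None

def predict_contextual(ratings, user, item, t):
    """R_D(u, i, t) = R_D[T=t](u, i): filter to context t, then
    apply the traditional two-dimensional method to the slice."""
    slice_t = [(u, i, r) for (u, i, c, r) in ratings if c == t]
    return predict_2d(slice_t, user, item)

ratings = [
    ("alice", "song1", "morning", 4.0),
    ("bob",   "song1", "morning", 5.0),
    ("bob",   "song1", "evening", 2.0),
]
print(predict_contextual(ratings, "alice", "song1", "morning"))  # 4.5
print(predict_contextual(ratings, "alice", "song1", "evening"))  # 2.0
```

Note how the evening prediction uses only the single evening rating: the reduction discards all out-of-context data, which is exactly why the method needs enough ratings per contextual segment.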
81. Context-Based Recommender Systems – Reduction-Based Approach  [Figure: a multidimensional model (user ratings with user features, product features, and context) is reduced to a bi-dimensional user × item model by keeping only the slice for T=t]  The idea is to reduce the multidimensional problem to a manageable two-dimensional model.
82. Context-Based Recommender Systems – Reduction-Based vs. Item Splitting  Reduction Based:  Uses cross-validation as the goodness-of-segmentation criterion – expensive  Segments are the same for all the items  Prediction is made using only the relevant segment  Item Splitting:  Uses external impurity measures (e.g. IG) – heuristic based  Each item is tested for a split separately  Prediction is made using all the information, including split items  Bottom Line  The best-known method (Reduction Based) is difficult to apply (it needs to search a huge space of contextual segments).  We propose a more adaptive and computationally efficient approach.
83. Context-Based Recommender Systems – Item Split Technique  Item Split – Intuition and Approach  Each item in the database is a candidate for splitting  The context defines all the possible splits of an item's ratings vector  We test all the possible splits – we do not have many contextual features  We choose the split (using a single contextual feature) that maximizes an impurity measure and whose impurity is higher than a threshold
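The split search can be sketched as follows. This is a simplified illustration: the absolute difference of the two group means stands in for the real impurity criteria the slides mention (e.g. information gain or a statistical test), and the data and threshold are hypothetical.

```python
# Item splitting sketch: for each contextual feature/value pair, split
# the item's ratings into "feature == value" vs "feature != value",
# score the split with an impurity proxy (here: |mean(left) - mean(right)|,
# a stand-in for IG or a t-test), and keep the best split above a threshold.

def best_split(item_ratings, threshold=0.5):
    """item_ratings: list of (rating, context) pairs, context a dict of
    contextual features. Returns (feature, value, impurity) or None."""
    best = None
    candidates = {(f, v) for _, ctx in item_ratings for f, v in ctx.items()}
    for feature, value in candidates:
        left = [r for r, ctx in item_ratings if ctx.get(feature) == value]
        right = [r for r, ctx in item_ratings if ctx.get(feature) != value]
        if not left or not right:
            continue  # the split must produce two non-empty virtual items
        impurity = abs(sum(left) / len(left) - sum(right) / len(right))
        if impurity > threshold and (best is None or impurity > best[2]):
            best = (feature, value, impurity)
    return best

song = [
    (5.0, {"weather": "sun"}),
    (4.0, {"weather": "sun"}),
    (2.0, {"weather": "rain"}),
    (1.0, {"weather": "rain"}),
]
print(best_split(song))  # splits on 'weather' with impurity 3.0
```

If a split is accepted, the item is replaced by two virtual items (one per side of the split), and standard CF then runs on the transformed rating matrix.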
84. Method 3: Context-Based Recommendation – Selected Techniques Explained: Linear Models
85. Context-Based Recommender Systems – Contextual Modelling Approach  Overview  In these approaches the context data are explicitly used in the prediction model.  There are several possibilities for using the contextual data.  For instance, the context can be used to extend the definition of the distance function in nearest-neighbour approaches.  The distance function must then also include a “context distance” component, in addition to the user distance (CF) or item distance (CB).
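A context-extended distance of the kind just described can be sketched as a weighted blend. All names here are illustrative: `user_dist` stands for any classical CF user distance (e.g. one derived from rating correlation), contexts are dicts of feature values, and `alpha` is a hypothetical blending weight.

```python
# Sketch of a nearest-neighbour distance extended with a context term:
# combined = alpha * user_distance + (1 - alpha) * context_distance.

def context_distance(ctx_a, ctx_b):
    """Fraction of contextual features on which the two contexts differ."""
    keys = set(ctx_a) | set(ctx_b)
    if not keys:
        return 0.0
    return sum(ctx_a.get(k) != ctx_b.get(k) for k in keys) / len(keys)

def combined_distance(user_dist, ctx_a, ctx_b, alpha=0.7):
    """Blend the classical user distance with the context distance."""
    return alpha * user_dist + (1 - alpha) * context_distance(ctx_a, ctx_b)

d = combined_distance(0.2,
                      {"time": "evening", "weather": "sun"},
                      {"time": "evening", "weather": "rain"})
print(round(d, 2))  # 0.29
```

With this blend, two users with identical rating histories are still pushed apart as neighbours when their rating contexts differ, which is the point of the extension.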
86. Context-Based Recommender Systems – Linear Models Approach  Overview  Presents an extension of the Matrix Factorization (MF) rating prediction technique that incorporates contextual information to adapt the recommendation to the user's target context.  In this approach, one model parameter is introduced for each pair of contextual factor and music track genre.  This allows learning how the context affects the ratings and how they deviate from the classical personalized prediction.
87. Context-Based Recommender Systems – Linear Models Approach  Example  The standard rating prediction for a user u and item i can be computed by a standard matrix factorization method for collaborative filtering; this is the simple predicted rating for this user and item pair, namely 4.24.
88. Context-Based Recommender Systems – Linear Models Approach  Example  In addition, the model we have used estimates context-aware predictions, i.e., predictions where a context is specified:  in the figure we have two contexts, c1 and c2 (sun and rain).
89. Context-Based Recommender Systems – Linear Models Approach  Example  The model makes these two context-aware rating predictions (4.94 and 3.84) by estimating from the available data two additional parameters that model the influence of the context on the item, b_ic1 and b_ic2  These two parameters describe the modifications to be made to the non-context-aware prediction to take the context into account. In the first case the predicted rating must be increased by 0.7 and in the second case decreased by 0.4.
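The adjustment described above amounts to adding a learned per-(item, context) bias to the baseline MF prediction. The sketch below reproduces the slide's numbers; how the biases are learned (by fitting them on the available ratings) is outside the sketch.

```python
# Context-aware prediction sketch: baseline MF prediction r_ui plus a
# learned item-context bias b_ic. Values match the slide's example:
# baseline 4.24, b_i,sun = +0.7, b_i,rain = -0.4.

def predict_in_context(r_ui, item_context_bias, context):
    """Non-contextual prediction plus the learned context bias b_ic
    (zero when no bias was learned for this context)."""
    return r_ui + item_context_bias.get(context, 0.0)

bias = {"sun": 0.7, "rain": -0.4}  # b_ic1 and b_ic2 from the example
print(round(predict_in_context(4.24, bias, "sun"), 2))   # 4.94
print(round(predict_in_context(4.24, bias, "rain"), 2))  # 3.84
```

Because the context enters only through an additive bias, the model gracefully falls back to the plain personalized prediction when the context is unknown or was never observed for that item.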
90. Context-Based Recommender Systems – Linear Models Approach  Predictive Model  Context-Aware Collaborative Filtering
91. Context-Based Recommender Systems – Linear Models Approach  Comparison of Mean Absolute Error  The largest improvement with respect to the non-personalized model based on the item average is achieved, as expected, by personalizing the recommendations (“MF CF”); this gives an improvement of 5%.  The personalized model can be further improved by contextualization (“MF CF + Context”), producing an improvement of 7% with respect to the item-average prediction and a 3% improvement over the personalized model.  The modelling approach and the rating acquisition process can substantially improve rating prediction accuracy when the contextual information is taken into account.
92. Method 4: Cross Domain
93. Method 4 – Cross Domain  Description  Cross-domain recommenders can recommend products and services of several domains that share resources (e.g., users, items, ratings, features, latent patterns).  Knowledge from one or several domains might be utilized in another domain to improve recommendations.  Selected Techniques  User-model mediation and aggregation
94. Cross-Domain – Overview  The majority of recommender systems (RS) work in a single domain, such as movies, books, or tourism.  However, human preferences may span multiple domains.  Knowledge of a user’s behavior in different domains might improve prediction in a specific domain.  A company might have knowledge of a user in one or more domains other than the target recommendation domain and would like to use it.
95. Cross-Domain – Motivation  Sparsity and cold-start problems: cross-domain algorithms may enrich the training data with data from other domains to mitigate sparsity.  User-friendly systems: by reusing data that was collected for one domain in other domains, systems can avoid interrupting users to ask for feedback.  Availability of cross-domain data: many e-commerce systems and social networks contain information about users’ preferences in several domains. Cross-domain information is therefore available, and it is worthwhile to look for effective algorithms that can use this data to improve recommender system performance.  Marketing – cross-selling of new products: marketing studies have found that it is effective to promote products from different domains to a user if they fit her buying patterns across domains.
96. Cross-Domain – State-of-the-Art Techniques  User-model mediation and aggregation  This technique was suggested by Berkovsky et al. (2006, 2007, 2008).  It aims at the sparsity challenge of recommender systems by enriching the UM with data from a remote system.  Requires an overlap of users between domains  Evaluation was performed on sub-domains of the same domain  Content-based unified user model  Ghani and Fano (2002) proposed generating a content-based user model that can be used across domains.  Extracts semantic features that might be relevant for many domains and are pre-defined by domain experts (e.g., trendiness vs. individualism)  Not implemented or evaluated
97. Cross-Domain – State-of-the-Art Techniques  Transfer learning (TL)  A relatively young research area (since 1995) in machine learning  Aims at extracting knowledge learned for one task in a domain and using it for a target task in a different domain.  TL techniques are recently gaining attention for applications where datasets are available only for specific domains
98. Method 4: Cross-Domain Recommendation – Selected Techniques Explained: User-Model Mediation and Aggregation
99. Cross-Domain – User-Model Mediation and Aggregation  Intuition and Approach  This technique was suggested by Berkovsky et al. (2006, 2007, 2008) and aims at the sparsity challenge of recommender systems by enriching the UM with data from a remote (source) system.  The suggested technique was demonstrated for the collaborative filtering approach and is based on mediating user-model data from other domains to enrich the user's model.  A similar approach was presented by Gonzales et al. (2006), who generate a unified UM that aggregates features from different domains and maps the aggregated features to the relevant domains
100. Cross-Domain – User-Model Mediation and Aggregation  Intuition and Approach  Applying the mediation suggested above by Berkovsky et al. requires:  Overlapping users – mediation enriches the data about a specific user with data about the same user from another domain (for other items, and possibly also in another context)  Same prediction task – mediation of data from other user models was applied between systems that implement the same prediction function (collaborative filtering), thus employing the same UM (users' ratings on items)  Similarity between domains – a method to identify such similarity is needed, and the similarity should be integrated into the recommender algorithm
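The three requirements above can be illustrated with a minimal mediation sketch. This is not Berkovsky et al.'s implementation: the user models are hypothetical item→rating dicts for the same (overlapping) user, and `domain_similarity` stands for whatever cross-domain similarity weight the system has estimated.

```python
# User-model mediation sketch: enrich a sparse target-domain user model
# with ratings imported for the same user from a remote (source) domain,
# discounted by an assumed cross-domain similarity weight. Ratings already
# present in the target domain are kept as-is.

def mediate(target_um, source_um, domain_similarity=0.8):
    """Return the target user model enriched with discounted
    source-domain ratings for items the target does not cover."""
    enriched = dict(target_um)
    for item, rating in source_um.items():
        if item not in enriched:
            enriched[item] = rating * domain_similarity
    return enriched

target = {"hotel_a": 4.0}                       # sparse target-domain UM
source = {"hotel_a": 3.0, "restaurant_b": 5.0}  # same user, remote domain
print(mediate(target, source))  # {'hotel_a': 4.0, 'restaurant_b': 4.0}
```

The enriched model then feeds the same collaborative-filtering predictor as before (the "same prediction task" requirement), with more ratings available per user and thus less sparsity.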