CARR’13 - Semantically-Enhanced Pre-Filtering for CARS

  • This work was done in collaboration with Francesco Ricci from the Free University of Bolzano and Luigi Ceccaroni from Bdigital, a technology centre in Barcelona.
  • CARS are commonly classified into three main paradigms depending on how they exploit context. This picture shows the general process of each one: pre-filtering (the most popular approach because it has a clear justification), post-filtering (adjusts the recommendations) and contextual modeling (context is incorporated directly in the model; multidimensional approaches). A common limitation of CARS is their need for large data sets… We present a novel solution to mitigate this limitation for pre-filtering, because this paradigm…
  • Why pre-filtering is very prone to sparsity-related problems: in traditional pre-filtering… This picture illustrates how traditional pre-filtering works. The method is too rigid and only works well when the target context has a large number of ratings.
  • Main idea of our solution. In this work we propose a novel approach to pre-filtering based on the intuition that ratings acquired in contexts that have a similar effect on the users' rating behavior as the target context can also be used for making recommendations. Continuing with the same example (assuming that family and weekend are semantically similar because they affect…), in our approach we may also use ratings acquired in those contexts.
  • Next, I will explain our approach in more detail. First, I will talk about how we acquire the semantic similarities among conditions. Then, I will describe how we exploit these similarities during pre-filtering. Finally, I will show the experiments we did to validate our approach.
  • So first I’m going to talk about our method for semantics acquisition that is based on implicit semantics and the VSM
  • Definition of implicit semantics (also known as distributional semantics). The implicit semantic similarity between two concepts is based on the distributional hypothesis, which claims that… In IR, usages are portions of textual data that can be defined at different granularities.
  • The VSM is used to find the usage overlap between two concepts (this matrix shows an example of such a representation). When usages are text-based, each entry stores the frequency, or a frequency-based weight, of the concept in that usage. (Implicit similarity is obtained by comparing the frequency vectors of two concepts.) Show the example (wine vs. glass: good overlap; wine vs. spoon: bad overlap).
  • We want to capture the usage of a condition in terms of how it influences the user's rating behavior (item-centered measure; usages = items; entries = weights indicating the influence…). Show example (good overlap between sunny and family; bad overlap with rainy). We can find non-obvious, domain-specific semantic relations (sunny and family).
  • Now I’d like to move on to the next point where I will describe how we actually exploit these semantic similarities between conditions to decide if a context is similar enough to the target one to be considered during ratings filtering.
  • The context of a rating can be defined by a single condition or by the conjunction of multiple conditions. Considering this, there are two main scenarios when comparing two contexts: 1. both are single conditions (show example); 2. at least one has multiple conditions (a pairwise comparison of the similarities among conditions is needed; show example).
  • To decide if two contexts are similar enough, we use a similarity threshold. Explain how this threshold can be obtained from training data (at different granularities; we proposed two methods; depending on the method used, given a target context, different situations can be considered similar enough). This example illustrates this. The table on the left… and the table on the right shows… For instance, if the target context is c1, using…
  • Remember how context is used in the pre-filtering paradigm (this graphic shows the 3-step process…). This pseudo-code summarizes the proposed semantic approach for rating filtering (in addition to… as in the traditional approach, it also considers…; when there are several ratings for the same item and user, we compute the average). As 2D prediction model we used… This equation shows…
  • Now, let’s talk about the experiments we did to validate the proposed semantic pre-filtering approach.
  • We used two real-world data sets of ratings with contextual information. One is about an in-car music recommender (this picture shows a screenshot of the interface used for rating collection; a single condition per rating).
  • The other data set is from a recommender of places of interest in the Bolzano surroundings. This screenshot shows the interface used for rating collection in this case, which is very similar to the previous one. As you can see, with this interface the data set has multiple ratings for the same item and user. In this example a user is asked to rate a castle in Bolzano without context and in specific conditions like sad and happy.
  • For the evaluation we compared the performance… The exact pre-filtering uses as 2D prediction model the same bias-based MF model as the proposed semantic approach. Explain the multidimensional MF model proposed by Baltrunas et al. (it consists of a standard bias-based MF model extended with additional contextual biases).
  • Explain the bar chart. We can see that the proposed semantic pre-filtering method using the global threshold, called Sem-Pref-gt, is the most accurate on both data sets. The other variant, using a different threshold per context, is slightly better than the traditional approach in terms of MAE but not RMSE. Main conclusion: these preliminary results indicate that, on the one hand, the proposed semantic pre-filtering effectively mitigates the sparsity-related problems of exact pre-filtering and, on the other hand, the proposed… can be a good alternative to the multidimensional approaches.
  • Of course, this work can be improved in several ways. We are currently investigating other variants of the proposed semantic pre-filtering (pairwise similarity strategies), user-centered instead of item-centered measures, and how to exploit these implicit semantic similarities in multidimensional approaches. Finally, we plan to extend the evaluation of the proposed semantic approach to ranking recommendation.
  • If you have any comments or questions, I'll be happy to hear them

Presentation Transcript

  • Semantically-Enhanced Pre-Filtering for Context-Aware Recommender Systems. CaRR 2013. Victor Codina, Francesco Ricci, Luigi Ceccaroni
  • Main paradigms for exploiting contextual information in CARS [Adomavicius & Tuzhilin, 2010]
  • Pre-filtering is very prone to suffer from sparsity-related problems. In traditional pre-filtering, only the ratings acquired in exactly the same context as that of the target user are used. An example (places-of-interest recommender): target context = sunny; of the ratings acquired in the contexts family, sunny, rainy and weekend, only the sunny set would be used. Insight: perhaps we can also use ratings acquired in similar contexts…
  • Our solution is based on also using ratings acquired in semantically similar contexts. We consider two contexts semantically similar if they have a similar effect on the users' rating behavior. Same example, target context = sunny: we know that sunny, family and weekend positively influence ratings for outdoor places (e.g. natural parks), so ratings acquired in family and weekend may also be used (figure labels: family, sunny, weekend, rainy).
  • Outline: Semantics acquisition; Semantics exploitation; Experiments
  • Outline: Semantics acquisition (Implicit semantics, Vector Space Model, Similarity calculation); Semantics exploitation; Experiments
  • The implicit semantic similarity is based on the distributional hypothesis. In implicit semantics, the meaning of a concept is captured by its usage. Distributional hypothesis: “concepts that share similar usages share similar meaning”. In Information Retrieval (IR), usages are portions of textual data: document, paragraph or sentence.
  • The Vector Space Model (VSM) is normally used to find the implicit semantic similarity. Two concepts are similar if their usages overlap. An example of a concept-usage matrix in IR (WordSpace), where usage = sentence (e.g. s1) and each entry is a frequency-based weight:

    Concept  s1  s2  s3  s4  s5  s6  s7
    glass     1   1   0   1   0   2   0
    wine      2   1   0   0   1   2   0
    spoon     0   1   1   1   0   0   2
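
A minimal sketch of the similarity calculation on the concept-usage matrix above. The slide does not name the vector comparison measure, so cosine similarity is an assumption here, and the names `usage` and `cosine` are illustrative only.

```python
import math

# Concept-usage matrix from the slide: one frequency-based weight per sentence s1..s7.
usage = {
    "glass": [1, 1, 0, 1, 0, 2, 0],
    "wine":  [2, 1, 0, 0, 1, 2, 0],
    "spoon": [0, 1, 1, 1, 0, 0, 2],
}

def cosine(a, b):
    """Cosine of the angle between two equal-length weight vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a concept with no usages overlaps with nothing
    return dot / (norm_a * norm_b)

wine_glass = cosine(usage["wine"], usage["glass"])
wine_spoon = cosine(usage["wine"], usage["spoon"])
print(round(wine_glass, 3), round(wine_spoon, 3))
```

Under this measure, wine and glass overlap far more than wine and spoon, which matches the contrast the example is meant to show.
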
  • Two conditions are semantically similar if they influence the user's ratings similarly. We measure the influence of a condition from an item-centered perspective: usage = item (e.g. a museum), and each entry is a real value indicating the influence of the condition on the item's ratings (positive, neutral or negative). [Table: contextual conditions sunny, family and rainy (rows) by items Natural Park 1, Natural Park 2, Natural Park 3, Walking Route 1, Museum 1 and Museum 2 (columns); the cell values are not preserved in the transcript.]
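
The following sketch shows one possible way to build such item-centered influence vectors. The slides do not give the influence formula, so the definition used here (mean rating of an item under a condition minus the item's overall mean rating) is an assumption, the ratings are invented for illustration, and cosine similarity is again an assumed comparison measure.

```python
import math
from collections import defaultdict

# Hypothetical (item, condition, rating) triples, invented for illustration only.
ratings = [
    ("park", "sunny", 5), ("park", "family", 5), ("park", "rainy", 1),
    ("museum", "sunny", 2), ("museum", "family", 3), ("museum", "rainy", 5),
    ("route", "sunny", 4), ("route", "family", 4), ("route", "rainy", 2),
]
items = ["park", "museum", "route"]

def influence_vectors(ratings, items):
    """Per condition, one entry per item: mean rating under the condition
    minus the item's overall mean (assumed influence measure)."""
    by_item = defaultdict(list)
    by_item_cond = defaultdict(list)
    for item, cond, r in ratings:
        by_item[item].append(r)
        by_item_cond[(item, cond)].append(r)
    vectors = {}
    for cond in sorted({c for _, c, _ in ratings}):
        vec = []
        for item in items:
            in_cond = by_item_cond.get((item, cond))
            if in_cond:
                item_mean = sum(by_item[item]) / len(by_item[item])
                vec.append(sum(in_cond) / len(in_cond) - item_mean)
            else:
                vec.append(0.0)  # no evidence: treat the influence as neutral
        vectors[cond] = vec
    return vectors

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

vecs = influence_vectors(ratings, items)
sunny_family = cosine(vecs["sunny"], vecs["family"])
sunny_rainy = cosine(vecs["sunny"], vecs["rainy"])
print(round(sunny_family, 2), round(sunny_rainy, 2))
```

With these invented ratings, sunny and family push item ratings in the same direction while rainy pushes them the opposite way, so the first similarity is high and the second is negative.
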
  • Outline: Semantics acquisition; Semantics exploitation (Comparing contexts, Pre-filtering algorithm); Experiments
  • Similarities among contexts are calculated from the similarities of their conditions. A context can be defined by a single condition or by multiple conditions. Two main scenarios when comparing two contexts:

    1. Both contexts (c1, c2) have a single condition: Sim(c1,c2) = sim1 (the similarity of the two conditions)

       Factors  Values c1  Values c2
       Weather  unknown    sunny
       Mood     happy      unknown
       Day      unknown    unknown

    2. At least one of the contexts has multiple conditions: Sim(c1,c2) = pairwise combination

       Factors  Values c1  Values c2
       Weather  unknown    sunny
       Mood     happy      sad
       Day      weekend    unknown
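
A hedged sketch of the two comparison scenarios. The slide only says "pairwise combination" for the multi-condition case, so averaging over all pairs of known conditions is an assumption, and the condition similarities in `cond_sim` are invented values.

```python
from itertools import product

# Hypothetical condition-to-condition similarities (invented values).
cond_sim = {
    frozenset(("happy", "sunny")): 0.8,
    frozenset(("happy", "sad")): 0.1,
    frozenset(("weekend", "sunny")): 0.7,
    frozenset(("weekend", "sad")): 0.2,
}

def sim_conditions(a, b):
    """Similarity of two conditions; unknown pairs default to 0."""
    if a == b:
        return 1.0
    return cond_sim.get(frozenset((a, b)), 0.0)

def sim_contexts(c1, c2):
    """Average similarity over all pairs of known conditions.
    The averaging rule is an assumed form of 'pairwise combination'."""
    known1 = [v for v in c1.values() if v is not None]
    known2 = [v for v in c2.values() if v is not None]
    if not known1 or not known2:
        return 0.0
    pairs = list(product(known1, known2))
    return sum(sim_conditions(a, b) for a, b in pairs) / len(pairs)

# Scenario 1: each context has a single known condition (None = unknown).
single_1 = {"Weather": None, "Mood": "happy", "Day": None}
single_2 = {"Weather": "sunny", "Mood": None, "Day": None}
# Scenario 2: at least one context has multiple known conditions.
multi_1 = {"Weather": None, "Mood": "happy", "Day": "weekend"}
multi_2 = {"Weather": "sunny", "Mood": "sad", "Day": None}

print(sim_contexts(single_1, single_2))
print(sim_contexts(multi_1, multi_2))
```

In the single-condition case the average reduces to the similarity of the two conditions, so both scenarios are handled by the same function.
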
  • A similarity threshold is used to decide if two contexts are semantically similar. This threshold can be learnt at different granularities. We have experimented with two methods for learning the optimal threshold (β) from training data: using a global threshold for all the possible target contexts, and using a different threshold per target context. Example:

    Similarities between contexts, e.g. Sim(c1,c3)

    Target context  c1   c2   c3
    c1              1    0.6  0.4
    c2              0.7  1    0.2
    c3              0.4  0.1  1

    Similar contexts

    Target context  Global threshold (β=0.5)  Different threshold (β1=0.7, β2=0.5, β3=0.3)
    c1              c2                        None
    c2              c1                        c1
    c3              None                      c1
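
The threshold example on this slide can be reproduced with a few lines. This sketch assumes each row of the similarity matrix belongs to a target context and that a context counts as similar when its similarity is at least β; the function and variable names are illustrative.

```python
# Context-to-context similarities from the slide; outer key = target context.
sim = {
    "c1": {"c1": 1.0, "c2": 0.6, "c3": 0.4},
    "c2": {"c1": 0.7, "c2": 1.0, "c3": 0.2},
    "c3": {"c1": 0.4, "c2": 0.1, "c3": 1.0},
}

def similar_contexts(target, beta):
    """Contexts other than the target whose similarity reaches the threshold."""
    return [c for c, s in sim[target].items() if c != target and s >= beta]

global_beta = 0.5
per_context_beta = {"c1": 0.7, "c2": 0.5, "c3": 0.3}

for target in sim:
    with_global = similar_contexts(target, global_beta)
    with_own = similar_contexts(target, per_context_beta[target])
    print(target, with_global, with_own)
```

For target c1, the global threshold admits c2 (0.6 ≥ 0.5) while the per-context threshold β1 = 0.7 admits nothing, in line with the "Similar contexts" values on the slide.
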
  • Our pre-filtering approach also uses ratings whose context is similar to the target one. Ratings filtering (Y = ratings in context, c* = target context, X = filtered ratings passed to the 2D prediction model):

    FOR EACH (user u, item i) in Y DO
      IF a rating in the target context (c*) exists THEN
        ADD it to X
      ELSE
        GET all ratings where Sim(c,c*) ≥ β
        ADD the rating average to X
      END IF
    END FOR

    2D prediction model: Matrix Factorization (MF) model learnt using Stochastic Gradient Descent. The rating prediction equation (not preserved in the transcript) has terms labeled overall rating average, user bias and item bias.
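
A runnable sketch of the rating-filtering pseudo-code. The data layout (tuples of user, item, context, rating), the `sim` lookup and all sample values are assumptions made for illustration; a (user, item) pair with no exact or similar rating is simply left out of X, which the slide does not state explicitly.

```python
from collections import defaultdict

def prefilter(ratings, target, sim, beta):
    """Build the 2D data X, a dict (user, item) -> rating, for one target context.

    ratings: iterable of (user, item, context, rating) tuples (the set Y).
    sim:     function (context, target) -> similarity.
    beta:    similarity threshold.
    """
    grouped = defaultdict(list)
    for user, item, context, rating in ratings:
        grouped[(user, item)].append((context, rating))

    filtered = {}
    for key, entries in grouped.items():
        exact = [r for c, r in entries if c == target]
        if exact:
            # A rating in the target context exists: use it (averaged if repeated).
            filtered[key] = sum(exact) / len(exact)
        else:
            # Otherwise average the ratings from sufficiently similar contexts.
            similar = [r for c, r in entries if sim(c, target) >= beta]
            if similar:
                filtered[key] = sum(similar) / len(similar)
    return filtered

# Hypothetical condition similarities and ratings (invented values).
cond_sim = {
    frozenset(("family", "sunny")): 0.8,
    frozenset(("weekend", "sunny")): 0.7,
    frozenset(("rainy", "sunny")): 0.1,
}

def sim(c, target):
    return 1.0 if c == target else cond_sim.get(frozenset((c, target)), 0.0)

Y = [
    ("ann", "park", "sunny", 5),
    ("ann", "museum", "family", 3),
    ("ann", "museum", "weekend", 4),
    ("bob", "park", "rainy", 1),
    ("bob", "route", "family", 4),
]
X = prefilter(Y, target="sunny", sim=sim, beta=0.5)
print(X)
```

With β = 1 and a `sim` that returns 1 only for identical contexts, the same function reduces to exact pre-filtering.
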
  • Outline: Semantics acquisition; Semantics exploitation; Experiments (Data sets, Considered prediction models, Experimental results)
  • For the experimentation we used two real-world contextually-tagged data sets. One data set is about an in-car music recommender. Rating in context: the context is defined by a single condition.
  • For the experimentation we used two real-world contextually-tagged data sets. The other data set is about a tourism recommender. Multiple ratings for the same item and user but in different conditions: without context, if “sad”, if “happy”, by “public transport”.
  • We compared the performance of our approach with two other approaches. The two proposed variants of the semantic pre-filtering: Sem-Pref-gt (using a global threshold) and Sem-Pref-dt (using a different threshold per context). A traditional approach based on exact pre-filtering: Exact-Pref (exact pre-filtering using the same MF model). A state-of-the-art context-aware MF approach: CAMF-CC (multi-dimensional MF model; bias of condition on the item's type).
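
The MF model shared by Exact-Pref and the semantic variants is described in the slides only as a bias-based model learnt with stochastic gradient descent. The sketch below uses the standard bias-based form (overall average + user bias + item bias + dot product of latent factors) as an assumption; the hyperparameters and sample ratings are invented and it is not the authors' implementation.

```python
import math
import random

def train_bias_mf(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """ratings: dict (user, item) -> rating. Returns a predict(user, item) function."""
    rng = random.Random(seed)
    mu = sum(ratings.values()) / len(ratings)          # overall rating average
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    b_u = {u: 0.0 for u in users}                      # user biases
    b_i = {i: 0.0 for i in items}                      # item biases
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        # Unknown users or items contribute no bias and no factor term.
        dot = sum(a * b for a, b in zip(p.get(u, []), q.get(i, [])))
        return mu + b_u.get(u, 0.0) + b_i.get(i, 0.0) + dot

    samples = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(samples)
        for (u, i), r in samples:
            err = r - predict(u, i)
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict

# Hypothetical filtered 2D ratings (invented values).
X = {
    ("ann", "park"): 5.0, ("ann", "museum"): 3.5,
    ("bob", "route"): 4.0, ("bob", "museum"): 2.0,
    ("cat", "park"): 4.0, ("cat", "route"): 3.0,
}
predict = train_bias_mf(X)
mean_rating = sum(X.values()) / len(X)
rmse_mean = math.sqrt(sum((r - mean_rating) ** 2 for r in X.values()) / len(X))
rmse_mf = math.sqrt(sum((r - predict(u, i)) ** 2 for (u, i), r in X.items()) / len(X))
print(round(rmse_mean, 3), round(rmse_mf, 3))
```

In a pre-filtering setup this model would be trained on the filtered ratings X for the target context, so the exact and semantic variants differ only in how X is built.
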
  • The semantic pre-filtering using a global threshold (Sem-Pref-gt) is the most accurate. [Bar charts: accuracy variation with respect to non-contextual MF, in terms of RMSE and MAE; higher values are better.]
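
For reference, a small sketch of the two error metrics and of an accuracy variation relative to a baseline. The slide does not define how the variation is computed, so the relative error reduction in percent is an assumption, and the sample predictions are invented.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def improvement(baseline_error, model_error):
    """Relative error reduction in percent (assumed definition of the variation);
    positive values mean the model is more accurate than the baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Hypothetical test ratings and predictions (invented values).
actual = [4, 2, 5, 3]
baseline_pred = [3, 3, 3, 3]   # stands in for a non-contextual model
model_pred = [4, 3, 4, 3]      # stands in for a context-aware model

mae_gain = improvement(mae(actual, baseline_pred), mae(actual, model_pred))
rmse_gain = improvement(rmse(actual, baseline_pred), rmse(actual, model_pred))
print(mae_gain, round(rmse_gain, 1))
```

Because RMSE squares the errors, a model can improve MAE without improving RMSE when it makes a few large errors.
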
  • Future work. Research in progress: new variants of the semantic pre-filtering approach; new methods for measuring the implicit semantics of contextual conditions. Planned in the near future: novel semantic multidimensional approaches; extending the evaluation by assessing the performance of the proposed approaches for ranking recommendation.
  • Semantically-Enhanced Pre-Filtering for Context-Aware Recommender Systems. Any comments or questions? Victor Codina, vcodina@lsi.upc.edu