2. Main paradigms for exploiting contextual information in CARS
[Adomavicius & Tuzhilin, 2010]
3. Pre-filtering is very prone to sparsity-related problems
In traditional pre-filtering, only the ratings acquired in exactly the same context as the target one are used
An example (places-of-interest recommender): with target context = sunny, only the ratings acquired in the “sunny” context would be used; ratings acquired under “family”, “rainy”, or “weekend” would be discarded.
Insight: Perhaps we can also use ratings acquired in similar contexts…
4. Our solution is based on also using ratings acquired in semantically similar contexts
We consider two contexts as semantically similar if they have a similar effect on the user’s rating behavior
Same example: We know that sunny, family, and weekend positively influence ratings for outdoor places (e.g. natural parks). With target context = sunny, ratings acquired under “family” and “weekend” can therefore also be used; only “rainy” is discarded.
6. Outline
Semantics acquisition
• Implicit semantics
• Vector Space Model
• Similarity calculation
Semantics exploitation
Experiments
7. The implicit semantic similarity is based on the distributional hypothesis
In implicit semantics, the meaning of a concept is captured by its usage
Distributional hypothesis: “concepts that share similar usages share similar meaning”
In Information Retrieval (IR), usages are portions of textual data:
• document
• paragraph
• sentence
8. The Vector Space Model (VSM) is normally used to find the implicit semantic similarity
Two concepts are similar if their usages overlap
An example of a concept-usage matrix in IR (WordSpace), where usage = sentence (s1…s7) and each entry is a frequency-based weight:

Concept  s1  s2  s3  s4  s5  s6  s7
glass     1   1   0   1   0   2   0
wine      2   1   0   0   1   2   0
spoon     0   1   1   1   0   0   2
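As a minimal sketch (not from the slides), the implicit similarity can be computed as the cosine between the frequency vectors of two concepts, using the WordSpace example above:

```python
import numpy as np

# Frequency-based weights from the WordSpace example
# (rows = concepts, columns = sentences s1..s7)
usage = {
    "glass": np.array([1, 1, 0, 1, 0, 2, 0], dtype=float),
    "wine":  np.array([2, 1, 0, 0, 1, 2, 0], dtype=float),
    "spoon": np.array([0, 1, 1, 1, 0, 0, 2], dtype=float),
}

def cosine(a, b):
    """Cosine similarity between two usage vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(usage["wine"], usage["glass"]))  # high overlap -> ~0.84 (similar)
print(cosine(usage["wine"], usage["spoon"]))  # low overlap  -> ~0.12 (dissimilar)
```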
9. Two conditions are semantically similar if they influence the user’s ratings similarly
We measure the influence of a condition from an item-centered perspective
Here usage = item (e.g. a museum), and each entry is a real value indicating the influence of a condition on the item’s ratings (positive, neutral, or negative).
[Example matrix: rows = contextual conditions (sunny, family, rainy); columns = items (Natural Park 1–3, Walking Route 1, Museum 1–2). The influence vectors of sunny and family overlap well; the vector of rainy does not.]
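A minimal sketch of the same idea for contextual conditions, with assumed (illustrative) influence values, since the exact numbers are not given in the slides:

```python
import numpy as np

# Illustrative (assumed) influence values: rows = contextual conditions,
# columns = items (Natural Park 1-3, Walking Route 1, Museum 1, Museum 2).
# Positive = the condition raises the item's ratings, negative = lowers them.
influence = {
    "sunny":  np.array([ 0.8,  0.6,  0.7,  0.5, -0.3, -0.2]),
    "family": np.array([ 0.7,  0.5,  0.6,  0.4, -0.4, -0.1]),
    "rainy":  np.array([-0.6, -0.5, -0.7, -0.4,  0.3,  0.5]),
}

def condition_sim(c1, c2):
    """Semantic similarity of two conditions = cosine of their influence vectors."""
    a, b = influence[c1], influence[c2]
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(condition_sim("sunny", "family"))  # close to 1: similar effect on ratings
print(condition_sim("sunny", "rainy"))   # negative: opposite effect
```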
11. Similarities among contexts are calculated from the similarities of their conditions
A context can be defined by a single condition or by multiple conditions
Two main scenarios when comparing two contexts:
1. Both contexts (c1, c2) have a single condition: Sim(c1,c2) = sim1 (the similarity between the two single conditions)

Factor       Values c1  Values c2
Weather      unknown    sunny
Mood         happy      unknown
Day of week  unknown    unknown

2. At least one of the contexts has multiple conditions: Sim(c1,c2) = pairwise combination of the condition similarities

Factor       Values c1  Values c2
Weather      unknown    sunny
Mood         happy      sad
Day of week  weekend    unknown
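A sketch of the two scenarios, assuming averaging as the pairwise-combination strategy (the slides only say “pairwise combination”, and the condition similarities below are illustrative):

```python
from itertools import product

# Assumed toy condition similarities (symmetric), for illustration only
COND_SIM = {
    frozenset(("happy", "sunny")):   0.6,
    frozenset(("happy", "sad")):    -0.8,
    frozenset(("weekend", "sunny")): 0.5,
    frozenset(("weekend", "sad")):  -0.2,
}

def condition_sim(a, b):
    return 1.0 if a == b else COND_SIM.get(frozenset((a, b)), 0.0)

def context_sim(c1, c2):
    """Similarity between two contexts (sets of conditions).

    If both contexts hold a single condition, this reduces to the condition
    similarity itself; otherwise all pairwise condition similarities are
    combined (here: averaged, one assumed strategy).
    """
    sims = [condition_sim(a, b) for a, b in product(c1, c2)]
    return sum(sims) / len(sims)

print(context_sim({"happy"}, {"sunny"}))                    # scenario 1: sim1 = 0.6
print(context_sim({"happy", "weekend"}, {"sunny", "sad"}))  # scenario 2: pairwise average
```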
12. A similarity threshold is used to decide if two contexts are semantically similar
This threshold can be learnt at different granularities
We have experimented with two methods for learning the optimal threshold (β) from training data:
• Using a global threshold for all the possible target contexts
• Using a different threshold per target context
Context similarities (rows = target context):

      c1   c2   c3
c1    1    0.6  0.4
c2    0.7  1    0.2
c3    0.4  0.1  1

Similar contexts per target context:

Target  Global threshold (β=0.5)  Different thresholds (β1=0.7, β2=0.5, β3=0.3)
c1      c2                        None
c2      c1                        c1
c3      None                      c1
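A small sketch of both threshold variants applied to the similarity matrix above (β values taken from the example):

```python
# Pairwise context similarities from the slide example (rows = target context)
sim = {
    "c1": {"c1": 1.0, "c2": 0.6, "c3": 0.4},
    "c2": {"c1": 0.7, "c2": 1.0, "c3": 0.2},
    "c3": {"c1": 0.4, "c2": 0.1, "c3": 1.0},
}

def similar_contexts(target, beta):
    """Contexts whose similarity to the target reaches the threshold beta."""
    return [c for c, s in sim[target].items() if c != target and s >= beta]

# Global threshold (beta = 0.5 for every target context)
print({c: similar_contexts(c, 0.5) for c in sim})
# {'c1': ['c2'], 'c2': ['c1'], 'c3': []}

# Per-context thresholds (beta1 = 0.7, beta2 = 0.5, beta3 = 0.3)
betas = {"c1": 0.7, "c2": 0.5, "c3": 0.3}
print({c: similar_contexts(c, betas[c]) for c in sim})
# {'c1': [], 'c2': ['c1'], 'c3': ['c1']}
```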
13. Our pre-filtering approach also uses ratings whose context is similar to the target one
Ratings filtering (from the contextual ratings Y to the 2D matrix X, given the target context c*):

FOR EACH (user u, item i) IN Y DO
    IF a rating in the target context c* exists THEN
        ADD it to X
    ELSE
        GET all ratings whose context c satisfies Sim(c,c*) ≥ β
        ADD their rating average to X
    END IF
END FOR
2D prediction model learning: a Matrix Factorization (MF) model learnt using Stochastic Gradient Descent
Rating prediction: r̂(u,i) = μ + bᵤ + bᵢ + qᵢᵀpᵤ, where μ is the overall rating average, bᵤ the user bias, bᵢ the item bias, and qᵢᵀpᵤ the latent-factor term
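A compact sketch of the 2D prediction step: biased MF trained with SGD on the filtered matrix X. Hyperparameters (k, lr, reg, epochs) are illustrative assumptions, not the values used in the paper:

```python
import numpy as np

def train_biased_mf(ratings, n_users, n_items, k=10, lr=0.01, reg=0.02, epochs=20):
    """Biased MF trained with stochastic gradient descent on the filtered matrix X.

    ratings: list of (user, item, rating) triples with 0-based indices.
    """
    rng = np.random.default_rng(0)
    mu = np.mean([r for _, _, r in ratings])   # overall rating average
    b_u = np.zeros(n_users)                    # user biases
    b_i = np.zeros(n_items)                    # item biases
    P = rng.normal(0, 0.1, (n_users, k))       # user latent factors
    Q = rng.normal(0, 0.1, (n_items, k))       # item latent factors

    for _ in range(epochs):
        for u, i, r in ratings:
            pu, qi = P[u].copy(), Q[i].copy()
            e = r - (mu + b_u[u] + b_i[i] + qi @ pu)   # prediction error
            b_u[u] += lr * (e - reg * b_u[u])
            b_i[i] += lr * (e - reg * b_i[i])
            P[u] += lr * (e * qi - reg * pu)
            Q[i] += lr * (e * pu - reg * qi)
    return mu, b_u, b_i, P, Q
```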
15. For the experimentation we used two real-world contextually-tagged data sets
One data set is about an in-car music recommender
[Screenshot: the interface used for rating in context]
Context is defined by a single condition
16. For the experimentation we used two real-world contextually-tagged data sets
The other data set is about a tourism recommender
Multiple ratings for the same item and user, but in different conditions:
• without context
• if “sad”
• if “happy”
• by “public transport”
17. We compared the performance of our approach with two other approaches
The two proposed variants of the semantic pre-filtering:
• Sem-Pref-gt – using a global threshold
• Sem-Pref-dt – using a different threshold per context
A traditional approach based on exact pre-filtering:
• Exact-Pref – exact pre-filtering using the same MF model
A state-of-the-art context-aware MF approach:
• CAMF-CC – multi-dimensional MF model with a bias of each condition on the item’s type
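For reference, a hedged sketch of the CAMF-CC baseline’s prediction rule as described by Baltrunas et al.: standard biased MF plus one learned bias per (item category, contextual condition) pair. The data layout (the b_cat_cond dict and item_cats mapping) is an assumption for illustration:

```python
def camf_cc_predict(mu, b_u, b_i, P, Q, b_cat_cond, item_cats, u, i, conditions):
    """CAMF-CC style prediction: biased MF plus per-(category, condition) biases.

    b_cat_cond: dict mapping (item category, condition) -> learned bias.
    item_cats:  dict mapping item -> list of its categories.
    conditions: the contextual conditions of the target context.
    """
    context_bias = sum(
        b_cat_cond.get((g, c), 0.0)
        for g in item_cats[i]
        for c in conditions
    )
    return mu + b_u[u] + b_i[i] + Q[i] @ P[u] + context_bias
```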
18. The semantic pre-filtering using a global threshold (Sem-Pref-gt) is the most accurate
[Bar chart: accuracy variation with respect to non-contextual MF, in terms of RMSE and MAE; higher values are better]
19. Future work
Research in progress…
• New variants of the semantic pre-filtering approach
• New methods for measuring the implicit semantics of contextual conditions
Planned in the near future:
• Novel semantic multidimensional approaches
• Extending the evaluation by assessing the performance of the proposed approaches for ranking recommendation
This work has been done in collaboration with Francesco Ricci from the Free University of Bolzano and Luigi Ceccaroni from BDigital, a technology centre in Barcelona.
CARS are commonly classified into 3 main paradigms depending on how they exploit context. This picture shows the general process of each one: pre-filtering (the most popular approach because it has a clear justification), post-filtering (adjusts the recommendations), and contextual modeling (context is incorporated directly into the model; multidimensional approaches). A common limitation of CARS is their need for large data sets… We present a novel solution to mitigate this limitation for pre-filtering, because this paradigm…
Why pre-filtering is very prone to sparsity-related problems: in traditional pre-filtering… This picture illustrates how traditional pre-filtering works. This method is too rigid and only works well when the target context has a large number of ratings.
Main idea of our solution. In this work, we have proposed a novel pre-filtering approach based on the intuition that ratings acquired in contexts that have a similar effect on the user's rating behavior as the target one can also be used for making recommendations. Continuing with the same example (assuming that family and weekend are semantically similar because they affect…) (in our approach, we may also use ratings acquired in those contexts).
Next, I will explain our approach in more detail. Firstly, I will talk about how we acquire the semantic similarities among conditions. Then, I will describe how we exploit these similarities during pre-filtering. Finally, I will show the experiments we did to validate our approach.
So first, I'm going to talk about our method for semantics acquisition, which is based on implicit semantics and the VSM.
Definition of implicit semantics (also known as distributional semantics). The implicit semantic similarity between two concepts is based on the distributional hypothesis, which claims that… In IR, usages are defined by portions of textual data that can be specified at different granularities.
The VSM is used to find the usage overlap between two concepts (this matrix shows an example of such a representation). When usages are text-based, each entry stores the frequency or a frequency-based weight of the concept (the implicit similarity is obtained by comparing the frequency vectors of two concepts). Show the example (wine vs. glass: good overlap; wine vs. spoon: bad overlap).
We want to capture the usage of a condition in terms of how it influences the user's rating behavior (item-centered measure; usages = items; entries = weights indicating the influence…). Show the example (good overlap between sunny and family; bad overlap with rainy). We can find non-obvious, domain-specific semantic relations (sunny and family).
Now I’d like to move on to the next point where I will describe how we actually exploit these semantic similarities between conditions to decide if a context is similar enough to the target one to be considered during ratings filtering.
The context of a rating can be defined by a single condition or by the conjunction of multiple conditions. Considering this, there are two main scenarios when comparing two contexts: 1) both are single conditions (show example); 2) at least one has multiple conditions (a pairwise comparison of the similarities among conditions is needed; show example).
To decide if two contexts are similar enough, we use a convenient similarity threshold. Explain how this threshold can be obtained from training data (at different granularities; we proposed two methods; depending on the method used, given a target context, different situations can be considered similar enough). This example tries to illustrate this: the table on the left… and the table on the right shows… For instance, if the target context is c1, using…
Remember how context is used in the pre-filtering paradigm (this graphic shows the 3-step process…). This pseudo-code summarizes the proposed semantic approach for rating filtering (in addition to… as in the traditional approach, it also considers…; when there are several ratings for the same item and user, we compute the average). As the 2D prediction model we used… This equation shows…
Now, let’s talk about the experiments we did to validate the proposed semantic pre-filtering approach.
We have used 2 real-world data sets of ratings with contextual information. One is about an in-car music recommender (this picture shows a screenshot of the interface used for rating collection; a single condition per rating).
The other data set we used is from a recommender of places of interest in the Bolzano surroundings. This screenshot shows the interface for rating collection used in this case, which is very similar to the previous one. As you can see, using this interface the data sets have multiple ratings for the same item and user. In this example a user is asked to rate a castle in Bolzano without context and in specific conditions like sad and happy.
For the evaluation we compared the performance… The exact pre-filtering uses as its 2D prediction model the same bias-based MF model as in the proposed semantic approach. Explain the multidimensional MF model proposed by Baltrunas et al. (it consists of a standard bias-based MF model extended with additional contextual biases).
Explain the bar chart. We can see that the proposed semantic pre-filtering method using the global threshold, called Sem-Pref-gt, is the most accurate on both data sets. The other variant, using a different threshold per context, is slightly better than the traditional approach in terms of MAE but not in terms of RMSE. Main conclusion: these preliminary results demonstrate that, on the one hand, the proposed semantic pre-filtering effectively mitigates the sparsity-related problems of the exact pre-filtering, and, on the other hand, that the proposed… can be a good alternative to the multidimensional approaches.
Of course, this work can be improved in several ways. Currently we are investigating other variants of the proposed semantic pre-filtering (pairwise similarity strategies); a user-centered instead of an item-centered measure; and how to exploit these implicit semantic similarities in multidimensional approaches. Finally, we plan to extend the evaluation of the proposed semantic approach to ranking recommendation.
If you have any comments or questions, I'll be happy to hear them