Semantically-Enhanced Pre-Filtering
 for Context-Aware Recommender
             Systems
                  CaRR 2013


     Victor Codina

     Francesco Ricci

     Luigi Ceccaroni
Main paradigms for exploiting
contextual information in CARS

   [Figure: the general process of the three paradigms (pre-filtering, post-filtering, contextual modeling)]

                         [Adomavicius & Tuzhilin, 2010]
Pre-filtering is very prone to suffer from
sparsity-related problems

 In traditional pre-filtering, only the ratings acquired in
  exactly the same context as the target one are used

 An example (places-of-interest recommender),
  with target context = sunny:

   [Figure: rating sets grouped by context (sunny, family, rainy, weekend); only the "sunny" set would be used]

 Insight: perhaps we can also use ratings acquired in similar contexts…
Our solution is based on also using ratings
acquired in semantically similar contexts

 We consider two contexts semantically similar if they
  have a similar effect on the users' rating behavior

 Same example, with target context = sunny:

   [Figure: "sunny", "family" and "weekend" grouped as similar contexts; "rainy" apart]

 We know that sunny, family and weekend positively influence
  ratings for outdoor places (e.g. natural parks)
Outline

 Semantics acquisition
 Semantics exploitation
 Experiments
Outline

 Semantics acquisition
   • Implicit semantics
   • Vector Space Model
   • Similarity calculation
 Semantics exploitation
 Experiments
The implicit semantic similarity is based on
the distributional hypothesis

 In implicit semantics the meaning of a concept is
 captured by its usage

                   distributional hypothesis:
  “concepts that share similar usages share similar meaning”


 In Information Retrieval (IR) usages are portions of
 textual data:
     • document
     • paragraph
     • sentence
The Vector Space Model (VSM) is normally
used to find the implicit semantic similarity

 Two concepts are similar if their usages overlap
 An example of a concept-usage matrix in IR (WordSpace),
  where each usage is a sentence (s1…s7) and each entry is a
  frequency-based weight:

   Concept   s1   s2   s3   s4   s5   s6   s7
   glass      1    1    0    1    0    2    0
   wine       2    1    0    0    1    2    0
   spoon      0    1    1    1    0    0    2
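The usage-overlap comparison over the matrix above is typically done with cosine similarity between the row vectors (the slides do not fix the exact measure; cosine is a standard choice, sketched here):

```python
import math

def cosine(u, v):
    """Cosine similarity between two usage-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Usage vectors over sentences s1..s7, taken from the matrix above
glass = [1, 1, 0, 1, 0, 2, 0]
wine  = [2, 1, 0, 0, 1, 2, 0]
spoon = [0, 1, 1, 1, 0, 0, 2]

print(round(cosine(glass, wine), 2))   # high overlap: 0.84
print(round(cosine(glass, spoon), 2))  # low overlap: 0.29
```

As the slide's example suggests, "wine" and "glass" share many usages and come out far more similar than "glass" and "spoon".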
Two conditions are semantically similar if
they influence the user’s ratings similarly

 We measure the influence of a condition from an
  item-centered perspective: each usage is an item (e.g. a museum),
  and each entry is a real value indicating the influence of the
  condition on the item’s ratings (positive, neutral or negative)

   [Table: contextual conditions (sunny, family, rainy) × items (Natural Parks 1–3, Walking Route 1, Museums 1–2); e.g. “sunny” influences the museums’ ratings negatively, “family” those of Walking Route 1 and Museum 1]
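One plausible way to fill this condition-item influence matrix (a sketch under assumptions; the slides do not give the exact formula) is to score each cell as the mean deviation of the item's ratings under that condition from the item's overall mean rating. The toy ratings below are hypothetical:

```python
from collections import defaultdict

def influence_weights(ratings):
    """ratings: list of (item, condition, rating).
    Returns {(condition, item): mean deviation from the item's overall
    mean rating}, i.e. a positive/neutral/negative influence score."""
    by_item = defaultdict(list)
    for item, cond, r in ratings:
        by_item[item].append(r)
    item_mean = {i: sum(rs) / len(rs) for i, rs in by_item.items()}

    cell = defaultdict(list)
    for item, cond, r in ratings:
        cell[(cond, item)].append(r - item_mean[item])
    return {k: sum(d) / len(d) for k, d in cell.items()}

# Hypothetical toy data, not from the paper
ratings = [
    ("Natural Park 1", "sunny", 5), ("Natural Park 1", "rainy", 2),
    ("Natural Park 1", "family", 5), ("Museum 1", "sunny", 2),
    ("Museum 1", "rainy", 4),
]
w = influence_weights(ratings)
# "sunny" comes out positive for Natural Park 1 and negative for Museum 1
```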
Outline

 Semantics acquisition
 Semantics exploitation
   • Comparing contexts
   • Pre-filtering algorithm
 Experiments
Similarities among contexts are calculated
from the similarities of their conditions

 A context can be defined by a single condition or by
  multiple conditions
 Two main scenarios when comparing two contexts:

   1. Both contexts (c1, c2) have a single condition → Sim(c1,c2) = sim1

      Factors        Values c1   Values c2
      Weather        unknown     sunny
      Mood           happy       unknown
      Day of week    unknown     unknown

      (sim1 = similarity between the two known conditions, “happy” and “sunny”)

   2. At least one of the contexts has multiple conditions → Sim(c1,c2) = pairwise combination

      Factors        Values c1   Values c2
      Weather        unknown     sunny
      Mood           happy       sad
      Day of week    weekend     unknown
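The "pairwise combination" of scenario 2 can be sketched as an average of condition-level similarities over all cross pairs of known conditions; the averaging choice and the `cond_sim` values below are assumptions, not the paper's exact aggregation:

```python
def context_similarity(c1, c2, cond_sim):
    """c1, c2: lists of known conditions (unknown factors omitted).
    cond_sim: {frozenset({a, b}): similarity} for pairs of conditions.
    Single-condition contexts reduce to a single lookup (scenario 1)."""
    pairs = [(a, b) for a in c1 for b in c2 if a != b]
    if not pairs:
        return 1.0  # identical single-condition contexts
    sims = [cond_sim.get(frozenset({a, b}), 0.0) for a, b in pairs]
    return sum(sims) / len(sims)

# Hypothetical condition-level similarities
cond_sim = {frozenset({"happy", "sunny"}): 0.8,
            frozenset({"happy", "sad"}): 0.1,
            frozenset({"weekend", "sunny"}): 0.6,
            frozenset({"weekend", "sad"}): 0.2}

# Scenario 1: both contexts are single conditions
print(context_similarity(["happy"], ["sunny"], cond_sim))  # 0.8
# Scenario 2: both contexts have multiple conditions (average of 4 pairs)
print(context_similarity(["happy", "weekend"], ["sunny", "sad"], cond_sim))
```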
A similarity threshold is used to decide if
two contexts are semantically similar

 This threshold can be learnt at different granularities
 We have experimented with two methods for learning
  the optimal threshold (β) from training data:
   • Using a global threshold for all the possible target contexts
   • Using a different threshold per target context

   Similarity matrix, e.g. Sim(c1,c3) = 0.4:

         c1    c2    c3
   c1    1     0.6   0.4
   c2    0.7   1     0.2
   c3    0.4   0.1   1

   Similar contexts per target context:

   Target   Global (β=0.5)   Different threshold (β1=0.7, β2=0.5, β3=0.3)
   c1       c2               None
   c2       c1               c1
   c3       None             c1
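The two threshold schemes can be replayed on that similarity matrix; the numbers below come directly from the table:

```python
sim = {("c1", "c2"): 0.6, ("c1", "c3"): 0.4,
       ("c2", "c1"): 0.7, ("c2", "c3"): 0.2,
       ("c3", "c1"): 0.4, ("c3", "c2"): 0.1}
contexts = ["c1", "c2", "c3"]

def similar(target, beta):
    """Contexts whose similarity to `target` reaches the threshold beta."""
    return {c for c in contexts if c != target and sim[(target, c)] >= beta}

# Global threshold beta = 0.5
print({t: similar(t, 0.5) for t in contexts})
# {'c1': {'c2'}, 'c2': {'c1'}, 'c3': set()}

# Per-target thresholds beta1=0.7, beta2=0.5, beta3=0.3
betas = {"c1": 0.7, "c2": 0.5, "c3": 0.3}
print({t: similar(t, betas[t]) for t in contexts})
# {'c1': set(), 'c2': {'c1'}, 'c3': {'c1'}}
```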
Our pre-filtering approach also uses ratings
whose context is similar to the target one

   Pipeline: Y → ratings filtering (given the target context c*) → X → 2D prediction model learning → rating predictions

   FOR EACH (user u, item i) in Y DO
       IF a rating of u for i in c* exists THEN
           ADD it to X
       ELSE
           GET all ratings of u for i in contexts c where Sim(c,c*) ≥ β
           ADD their average to X
       END IF
   END FOR

 2D prediction model: a Matrix Factorization (MF) model learnt using
  Stochastic Gradient Descent; its prediction combines the overall
  rating average, the user bias and the item bias
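The filtering loop above can be sketched in Python; the data layout and helper names are illustrative, not the paper's:

```python
from collections import defaultdict

def semantic_prefilter(ratings, target, sim, beta):
    """ratings: list of (user, item, context, rating).
    Keeps, per (user, item), the rating given exactly in the target
    context if one exists; otherwise averages the ratings given in
    contexts with Sim(c, target) >= beta. Returns {(user, item): rating}."""
    exact, similar = {}, defaultdict(list)
    for u, i, c, r in ratings:
        if c == target:
            exact[(u, i)] = r
        elif sim(c, target) >= beta:
            similar[(u, i)].append(r)

    X = dict(exact)
    for key, rs in similar.items():
        if key not in X:  # fall back to the similar-context average
            X[key] = sum(rs) / len(rs)
    return X

# Hypothetical toy data
ratings = [("u1", "park1", "sunny", 5),
           ("u1", "park1", "rainy", 2),
           ("u2", "park1", "family", 4),
           ("u2", "park1", "weekend", 5)]
sims = {("family", "sunny"): 0.8, ("weekend", "sunny"): 0.7,
        ("rainy", "sunny"): 0.1}
sim = lambda c, t: sims.get((c, t), 0.0)

X = semantic_prefilter(ratings, "sunny", sim, beta=0.5)
print(X)  # {('u1', 'park1'): 5, ('u2', 'park1'): 4.5}
```

Here u1's exact "sunny" rating is kept as-is, while u2's "family" and "weekend" ratings (both similar enough to "sunny") are averaged; the "rainy" rating falls below β and is discarded.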
Outline

 Semantics acquisition
 Semantics exploitation
 Experiments
   • Data sets
   • Considered prediction models
   • Experimental results
For the experimentation we used two real-
world contextually-tagged data sets

  One data set is about an In-Car music recommender

   [Screenshot: rating-in-context interface; context is defined by a single condition]
For the experimentation we used two real-
world contextually-tagged data sets

  The other data set is about a tourism recommender

   [Screenshot: rating interface allowing multiple ratings for the same item and user in different conditions: without context, if “sad”, if “happy”, by “public transport”]
We compared the performance of our
approach with two other approaches

 The two proposed variants of the semantic pre-filtering:
   • Sem-Pref-gt: using a global threshold
   • Sem-Pref-dt: using a different threshold per context
 A traditional approach based on exact pre-filtering:
   • Exact-Pref: exact pre-filtering using the same MF model
 A state-of-the-art context-aware MF approach:
   • CAMF-CC: multi-dimensional MF model with a bias for each
     condition on the item’s type
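A minimal sketch of the underlying prediction rules, assuming the standard bias-based MF form and a simplified CAMF-CC in which each (condition, item category) pair gets one extra bias; all parameter values below are made up:

```python
def predict_mf(mu, b_u, b_i, p_u, q_i):
    """Bias-based MF: overall average + user bias + item bias + factors."""
    return mu + b_u + b_i + sum(pf * qf for pf, qf in zip(p_u, q_i))

def predict_camf_cc(mu, b_u, b_i, p_u, q_i, cond_cat_bias, conditions, category):
    """CAMF-CC-style: adds one learnt bias per (condition, item category)."""
    base = predict_mf(mu, b_u, b_i, p_u, q_i)
    return base + sum(cond_cat_bias.get((c, category), 0.0) for c in conditions)

# Hypothetical learnt parameters
mu, b_u, b_i = 3.0, 0.5, -0.5
p_u, q_i = [1.0, 2.0], [0.5, 0.25]
print(predict_mf(mu, b_u, b_i, p_u, q_i))  # 4.0

bias = {("sunny", "outdoor"): 0.25}
print(predict_camf_cc(mu, b_u, b_i, p_u, q_i, bias, ["sunny"], "outdoor"))  # 4.25
```

Exact-Pref and the two Sem-Pref variants all reuse `predict_mf` on the filtered rating set, so the comparison isolates the effect of the filtering strategy.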
The semantic pre-filtering using a global
threshold (Sem-Pref-gt) is the most accurate

   [Bar chart: accuracy variation with respect to non-contextual MF, in RMSE and MAE; higher values are better]
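The two reported error metrics are standard; for reference:

```python
import math

def rmse(pred, actual):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

def mae(pred, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

# Hypothetical predicted vs. actual ratings
pred, actual = [4.0, 3.0, 5.0, 2.0], [4.0, 4.0, 3.0, 2.0]
print(rmse(pred, actual))  # sqrt((0 + 1 + 4 + 0) / 4) ≈ 1.118
print(mae(pred, actual))   # (0 + 1 + 2 + 0) / 4 = 0.75
```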
Future work

Research in progress…
   • New variants of the semantic pre-filtering approach
   • New methods for measuring the implicit semantics of
     contextual conditions

Planned in the near future:
   • Novel semantic multidimensional approaches
   • Extend the evaluation by assessing the performance of the
     proposed approaches for ranking recommendation
Semantically-Enhanced Pre-Filtering
 for Context-Aware Recommender
             Systems
      Any comments or questions?


              Victor Codina

           vcodina@lsi.upc.edu

More Related Content

Similar to CARR’13 - Semantically-Enhanced Pre-Filtering for CARS (7)

• Moving Cast Shadow Detection Using Physics-based Features (CVPR 2009), by Jia-Bin Huang
• Extending Recommendation Systems With Semantics And Context Awareness, by Victor Codina
• Multivariate analyses & decoding, by khbrodersen
• Isvc08
• Exploiting Distributional Semantics Models for Natural Language Context-aware..., by GiuseppeSpillo
• Hussain Learning Relevant Eye Movement Feature Spaces Across Users, by Kalle
• [CARS2012@RecSys]Optimal Feature Selection for Context-Aware Recommendation u..., by YONG ZHENG


Editor's Notes

  1. This work has been done in collaboration with Francesco Ricci, from the Free University of Bolzano, and Luigi Ceccaroni, from BDigital, a technology centre in Barcelona
  2. CARS are commonly classified in 3 main paradigms depending on how they exploit context. This picture shows the general process of each one: pre-filtering (the most popular approach, because it has a clear justification), post-filtering (adjusts recommendations) and contextual modeling (context is incorporated directly in the model, as in multidimensional approaches). A common limitation of CARS is their need for large data sets. We present a novel solution to mitigate this limitation for pre-filtering, because this paradigm…
  3. Pre-filtering is very prone to sparsity-related problems because, in traditional pre-filtering, only the ratings acquired in exactly the target context are used. This picture illustrates how traditional pre-filtering works. This method is too rigid and only works well when the target context has a large number of ratings.
  4. Main idea of our solution. In this work, we have proposed a novel approach for pre-filtering based on the intuition that ratings acquired in contexts that have a similar effect on the users' rating behavior as the target one can also be used for making recommendations. Following the same example (assuming that family and weekend are semantically similar to sunny because they affect ratings similarly), our approach may also use ratings acquired in those contexts.
  5. Next, I will explain our approach in more detail. Firstly, I will talk about how we acquire the semantic similarities among conditions Then, I will describe how we exploit these similarities during pre-filtering And Finally, I will show the experiments we did to validate our approach
  6. So first I’m going to talk about our method for semantics acquisition that is based on implicit semantics and the VSM
  7. Definition of implicit semantics (also known as distributional semantics) The Implicit semantic similarity between two concepts is based on the distributional hypothesis which claims that… In IR usages are defined by portions of textual data that can be defined at different granularities
  8. VSM is used to find the usage overlap between two concepts (this matrix shows an example of such a representation) When usages are text-based, each entry stores the frequency or a frequency-based weight of the concept . (implicit similarity is obtained by comparing the frequency vectors between two concepts) Show the example (wine vs glass good overlap; wine vs spoon bad overlap)
  9. We want to capture the usage of a condition in terms of how it influences the user's rating behavior (an item-centered measure: usages = items; entries = weights indicating the influence). Show the example (good overlap between sunny and family; bad overlap with rainy). We can find non-obvious domain-specific semantic relations (sunny and family).
  10. Now I’d like to move on to the next point where I will describe how we actually exploit these semantic similarities between conditions to decide if a context is similar enough to the target one to be considered during ratings filtering.
  11. The context of a rating can be defined by a single condition or by the conjunction of multiple conditions. Considering this, there are two main scenarios when comparing two contexts: 1) both are single conditions (show example); 2) at least one has multiple conditions (a pairwise comparison of the similarities among conditions is needed, show example)
  12. To decide if two contexts are similar enough, we use a convenient similarity threshold. Explain how this threshold can be obtained from training data (at different granularities; we proposed two methods; depending on the method used, given a target context, different situations can be considered similar enough). This example tries to illustrate this. The table on the left… and the table on the right shows… For instance, if the target context is c1, using…
  13. Remember how context is used in prefiltering paradigm (this graphic shows this 3-step process…) This pseudo-code summarizes the proposed semantic approach for rating filtering. (in addition to… as in traditional approach, it also considers…; when having several ratings for the same item and user we compute the average) As 2D prediction model we used… This equation shows…
  14. Now, let’s talk about the experiments we did to validate the proposed semantic pre-filtering approach.
  15. We have used 2 real-world data sets of ratings with contextual information. One is about an in-car music recommender (this picture shows a screenshot of the interface used for rating collection; a single condition per rating)
  16. The other data set we used is of a recommender of places of interest in the Bolzano surroundings. This screenshot shows the interface for rating collection used in this case, which is very similar to the previous one. As you can see, using this interface the data sets have multiple ratings for the same item and user. In this example a user is asked to rate a castle in Bolzano without context and in specific conditions like sad and happy
  17. For the evaluation we compared the performance… The exact pre-filtering uses as 2D prediction model the same bias-based MF model as the proposed semantic approach. Explain the MD MF model proposed by Baltrunas et al. (it consists of a standard bias-based MF model extended with additional contextual biases)
  18. Explain the bar chart. We can see that the proposed semantic pre-filtering method using the global threshold, called Sem-Pref-gt, is the most accurate in both data sets. The other variant, using a different threshold per context, is slightly better than the traditional approach in terms of MAE but not of RMSE. Main conclusion: these preliminary results demonstrate that, on the one hand, the proposed semantic pre-filtering effectively mitigates the sparsity-related problems of the exact pre-filtering and, on the other hand, that the proposed… can be a good alternative to the multidimensional approaches
  19. Of course, this work can be improved in several ways. Currently we are investigating other variants of the proposed semantic pre-filtering (pairwise similarity strategies), and a user-centered instead of item-centered measure. Also, how to exploit these implicit semantic similarities in multidimensional approaches. Finally, extending the evaluation of the proposed semantic approach to ranking recommendation
  20. If you have any comments or questions, I'll be happy to hear them