
PhD defense - Exploiting distributional semantics for content-based and context-aware recommendation


Slides of my PhD defense - Exploiting distributional semantics for content-based and context-aware recommendation

Published in: Software, Education, Technology

  1. Exploiting distributional semantics for Content-Based and Context-Aware Recommendation. PhD in Artificial Intelligence. Victor Codina. Advisor: Luigi Ceccaroni. Universitat Politècnica de Catalunya, June 2014
  2. Information and choice overload problem
  3. Recommender Systems help users find the right items through recommendations
  4. Recommender Systems are a widely adopted technology in many domains
  5. Recommender system's components: Knowledge base, Recommender Engine, User Interface, Item data, User data
  6. Main families of recommendation models: Collaborative Filtering (CF), Content-Based (CB) Filtering, Context-aware Recommendation (CARS). Diagram labels: item metadata, ratings, context, target user, target item, predicted rating. LIMITATION: Low accuracy in data-sparsity scenarios
  7. Existing solution: use the knowledge contained in domain ontologies. Exploitation of explicit semantic relationships to mitigate the data-sparsity problem. Semantically-Enhanced CB Filtering: item ontology, attribute similarities (example: castle is-a Historic building, monastery is-a Historic building). Semantically-Enhanced CARS: context ontology, condition similarities (example: sunny is-a Weather, cloudy is-a Weather)
  8. Limitations of domain ontologies: Building and maintaining ontologies is expensive. Ontologies are bounded by fixed representations. They may not suit the data. Diagram labels: domain expert, ontology ≠ rating data
  9. Key idea: exploit distributional semantics derived from rating data (rating data -> semantic similarities). Similarities are automatically derived from the data itself. Advantages: collecting rating data is cheaper than building ontologies; not bounded by a fixed knowledge representation; fine-grained semantic similarities can be identified
  10. Research questions. Question 1: Is it possible to enhance content-based recommendation by exploiting the distributional semantics of item attributes? Question 2: Is it possible to enhance contextual recommendation by exploiting the distributional semantics of contextual conditions?
  11. Outline: Distributional Semantics; Novel content-based approach (SCB); Novel context-aware approach (SPF)
  12. Outline: Distributional Semantics (distributional hypothesis, semantic vector representation, distributional similarity measures); Novel content-based approach (SCB); Novel context-aware approach (SPF)
  13. Distributional hypothesis: The meaning of a concept is captured by its usage. Distributional Hypothesis: “concepts that share similar usages share similar meaning”. In Linguistics, usages are regions of text: document, paragraph, sentence
  14. Semantic vector representation. Each column is a region of text (s1 = “sentence 1”) and each cell is a frequency-based weight:
        Word    s1  s2  s3  s4  s5  s6  s7
        glass    2   1   0   1   0   2   0
        wine     2   1   1   0   1   2   0
        spoon    0   0   1   1   0   0   2
  15. Distributional similarity measure: Cosine similarity is the most popular measure, with good accuracy in high-dimensional vector spaces. Advantage: it can be used in combination with dimensionality reduction techniques (SVD). Diagram labels: Glass, Wine, Spoon
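As a worked illustration of the cosine measure, the arithmetic below applies it to the glass, wine and spoon vectors of slide 14. This calculation is added for illustration and does not appear on the original slides.

```latex
\[
\cos(\mathbf{u},\mathbf{v}) =
  \frac{\mathbf{u}\cdot\mathbf{v}}{\lVert\mathbf{u}\rVert\,\lVert\mathbf{v}\rVert}
\]
\[
\cos(\text{glass},\text{wine}) =
  \frac{2\cdot 2 + 1\cdot 1 + 2\cdot 2}{\sqrt{10}\,\sqrt{11}}
  = \frac{9}{\sqrt{110}} \approx 0.858
\]
\[
\cos(\text{glass},\text{spoon}) =
  \frac{1\cdot 1}{\sqrt{10}\,\sqrt{6}}
  = \frac{1}{\sqrt{60}} \approx 0.129
\]
```

Under this measure, glass is far closer to wine than to spoon, because glass and wine occur in mostly the same sentences.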
  16. Outline: Distributional Semantics; Novel content-based approach (SCB); Novel context-aware approach (SPF)
  17. Outline: Distributional Semantics; Novel content-based approach (SCB): content-based recommendation, limitations of traditional item-to-user profile matching, semantic item-to-user profile matching, user-dependent distributional semantics, experimental evaluation; Novel context-aware approach (SPF)
  18. Content-based Recommendation. IDEA: “show me more of the same I’ve liked”. Diagram: target user’s ratings and item metadata -> Profile Learner -> user profile; user profile and target item profile -> Profile Matching -> predicted rating
  19. Traditional item-to-user profile matching: Lack of semantics exploitation. Syntactically different attribute pairs are not considered. Hypothesis: profile matching can be enhanced by exploiting similarities between attributes. Example over attributes a1 to a5: Item profile = (0.2, 1, 0.5, 0, 1), User profile = (0, 0.7, 0, 1, 0); score = 1 x 0.7
  20. Semantic item-to-user profile matching: Hypothesis: best-pairs is better for rating prediction and all-pairs is better for ranking prediction. The all-pairs strategy and the best-pairs strategy are each illustrated on the same example over attributes a1 to a5: Item profile = (0.2, 1, 0.5, 0, 1), User profile = (0, 0.7, 0, 1, 0)
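The slides name the two matching strategies but do not give their formulas, so the following is only a minimal sketch under stated assumptions. It assumes that traditional matching is a dot product (consistent with "score = 1 x 0.7" on slide 19), that all-pairs sums every item/user attribute pair weighted by their similarity, and that best-pairs keeps only the most similar user attribute for each item attribute. The similarity values are hypothetical.

```python
from typing import List

def traditional_match(item: List[float], user: List[float]) -> float:
    """Dot product: only attributes present in both profiles contribute."""
    return sum(i * u for i, u in zip(item, user))

def all_pairs_match(item: List[float], user: List[float],
                    sim: List[List[float]]) -> float:
    """ASSUMED reading of all-pairs: every (item attr, user attr) pair
    contributes, weighted by the similarity of the two attributes."""
    return sum(item[i] * user[j] * sim[i][j]
               for i in range(len(item)) for j in range(len(user)))

def best_pairs_match(item: List[float], user: List[float],
                     sim: List[List[float]]) -> float:
    """ASSUMED reading of best-pairs: each item attribute is paired only
    with the most similar attribute that the user profile actually has."""
    candidates = [j for j, u in enumerate(user) if u != 0]
    score = 0.0
    for i, item_weight in enumerate(item):
        if item_weight == 0 or not candidates:
            continue
        j_best = max(candidates, key=lambda j: sim[i][j])
        score += item_weight * user[j_best] * sim[i][j_best]
    return score

# Profiles from slides 19 and 20 (attributes a1..a5).
item_profile = [0.2, 1.0, 0.5, 0.0, 1.0]
user_profile = [0.0, 0.7, 0.0, 1.0, 0.0]

# Identity similarity: an attribute is similar only to itself.
identity = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]

# HYPOTHETICAL attribute similarities, invented for this example.
sim = [row[:] for row in identity]
sim[0][3] = sim[3][0] = 0.5  # a1 ~ a4
sim[4][1] = sim[1][4] = 0.8  # a5 ~ a2
sim[4][3] = sim[3][4] = 0.3  # a5 ~ a4

print(traditional_match(item_profile, user_profile))
print(all_pairs_match(item_profile, user_profile, sim))
print(best_pairs_match(item_profile, user_profile, sim))
```

With the identity matrix both semantic strategies reduce to the traditional score, which is a useful sanity check on any concrete definition of the two strategies.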
  21. Distributional semantics of item’s attributes derived from rating data. Assumption: two attributes are similar if several users are interested in them similarly. Each cell is a user’s degree of interest in the attribute (“-1” = strong dislike, “1” = strong like); for example, User6’s degree of interest in action movies is -1:
        Attribute      User1  User2  User3  User4  User5  User6  User7
        action           1    -0.7    0      0.9    0.1   -1      0
        Bruce Willis     0.7  -0.8    0.5    0.8    0.4   -0.2    0
        comedy          -0.5   0.7    0.2   -1      0.9    0.8    0.5
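A minimal sketch of how attribute similarities could be read off the table on slide 21. It assumes cosine similarity over the per-user interest vectors; slide 15 calls cosine the most popular measure, but the slides do not state which measure is applied to attributes, and the dimensionality reduction step is left out here.

```python
import math
from typing import Dict, List

def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_u * norm_v)

# Interest of User1..User7 in each attribute, copied from slide 21.
interest: Dict[str, List[float]] = {
    "action":       [1.0, -0.7, 0.0, 0.9, 0.1, -1.0, 0.0],
    "Bruce Willis": [0.7, -0.8, 0.5, 0.8, 0.4, -0.2, 0.0],
    "comedy":       [-0.5, 0.7, 0.2, -1.0, 0.9, 0.8, 0.5],
}

sim_action_willis = cosine(interest["action"], interest["Bruce Willis"])
sim_action_comedy = cosine(interest["action"], interest["comedy"])

print(f"action vs Bruce Willis: {sim_action_willis:.3f}")
print(f"action vs comedy:       {sim_action_comedy:.3f}")
```

On these numbers, action and Bruce Willis come out strongly similar while action and comedy come out dissimilar, which matches the intuition behind the slide's assumption.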
  22. Evaluation using MovieLens data set. Rating data set statistics before and after pruning:
        Statistic           Original   Pruned
        Users               2,113      2,113
        Movies              10,197     1,646
        Attributes          6          4
        Attribute values    13,367     3,105
        Ratings per user    404        235
        Sparsity            96%        86%
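The sparsity figures on slide 22 can be checked with a short calculation. This sketch assumes the usual definition of sparsity (the fraction of empty cells in the user-by-item rating matrix) and treats "ratings per user" as an average; neither assumption is stated on the slide.

```python
def sparsity(ratings_per_user: float, n_items: int) -> float:
    """Fraction of empty cells in the user-by-item rating matrix.

    With R total ratings, U users and I items, sparsity = 1 - R / (U * I).
    Since R = ratings_per_user * U, the number of users cancels out.
    """
    return 1.0 - ratings_per_user / n_items

original = sparsity(ratings_per_user=404, n_items=10_197)
pruned = sparsity(ratings_per_user=235, n_items=1_646)

print(f"original: {original:.1%}")
print(f"pruned:   {pruned:.1%}")
```

Both values round to the percentages reported on the slide (96% and 86%).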
  23. Best-pairs vs. All-pairs. Chart panels: rating prediction, ranking prediction. % = Improvement with respect to the traditional CB profile matching (the higher, the better)
  24. Distributional vs. Ontology semantics. Chart panel: ranking prediction. % = Improvement with respect to the traditional CB profile matching (the higher, the better)
  25. SCB vs. State of the art. Methods compared: SCB (proposed method), SVD++, BPR-MF. Chart panels: rating prediction, ranking prediction. % = Improvement with respect to the traditional CB profile matching
  26. Outline: Distributional Semantics; Novel content-based approach (SCB); Novel context-aware approach (SPF)
  27. Outline: Distributional Semantics; Novel content-based approach (SCB); Novel context-aware approach (SPF): context-aware recommendation, limitations of traditional contextual pre-filtering, semantic pre-filtering approach, rating-based distributional semantics of conditions, experimental evaluation
  28. Context matters. Assumption: the user’s experience depends on context
  29. Context-aware recommendation: Context as an additional dimension for estimation. Three main context-aware recommender families: pre-filtering, contextual modeling, post-filtering. Diagram: in-context ratings, target item, target user and target context -> Prediction model -> predicted rating
  30. Traditional contextual pre-filtering: Main limitation: its lack of flexibility. Only ratings acquired in exactly the same context are used. Hypothesis: ratings filtering can be enhanced by exploiting semantic similarities between contexts. Diagram: in-context ratings and target context -> Ratings filtering -> local ratings -> Prediction model -> predicted rating
  31. Semantic contextual pre-filtering: Key idea: reuse ratings acquired in similar contexts. Diagram: in-context ratings and target context -> Ratings filtering (semantic similarities, global threshold; contexts marked ≈ or ≠) -> local ratings -> Prediction model -> predicted rating
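The difference between traditional and semantic pre-filtering can be sketched as two filters over the same in-context ratings. This is an illustration under stated assumptions, not the thesis implementation: each rating carries a single contextual condition, the condition similarities are hypothetical values, and the names `exact_prefilter`, `semantic_prefilter` and `threshold` are invented here. The prediction model that would consume the local ratings is left out.

```python
from typing import Dict, FrozenSet, List, Tuple

# (user, item, contextual condition, rating)
Rating = Tuple[str, str, str, float]

def exact_prefilter(ratings: List[Rating], target: str) -> List[Rating]:
    """Traditional pre-filtering: keep only ratings from the same context."""
    return [r for r in ratings if r[2] == target]

def semantic_prefilter(ratings: List[Rating], target: str,
                       sim: Dict[FrozenSet[str], float],
                       threshold: float) -> List[Rating]:
    """Semantic pre-filtering: also keep ratings whose context is at least
    `threshold`-similar to the target context (one global threshold)."""
    def similarity(condition: str) -> float:
        if condition == target:
            return 1.0
        return sim.get(frozenset({condition, target}), 0.0)
    return [r for r in ratings if similarity(r[2]) >= threshold]

# Toy in-context ratings, invented for this example.
ratings: List[Rating] = [
    ("u1", "i1", "sunny", 5.0),
    ("u2", "i1", "cloudy", 4.0),
    ("u3", "i1", "rainy", 2.0),
    ("u4", "i1", "sunny", 4.0),
]

# HYPOTHETICAL condition similarities (symmetric, keyed by unordered pair).
condition_sim: Dict[FrozenSet[str], float] = {
    frozenset({"sunny", "cloudy"}): 0.8,
    frozenset({"sunny", "rainy"}): 0.1,
}

local_exact = exact_prefilter(ratings, "sunny")
local_semantic = semantic_prefilter(ratings, "sunny", condition_sim, threshold=0.5)

print(len(local_exact), len(local_semantic))
```

Raising the threshold to 1.0 makes the semantic filter behave like the exact one, so the threshold controls how far ratings are borrowed from neighbouring contexts.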
  32. Distributional semantics of contextual conditions derived from rating data. Assumption: two contexts are similar if their composing conditions influence ratings similarly. Each cell is the influence of a condition on a user’s ratings (“<0” = negative, “0” = neutral, “>0” = positive); the slide highlights the influence of the family context on User6’s ratings. The condition labels of the three rows are missing from the transcript:
        Condition    User1  User2  User3  User4  User5  User6  User7
        (missing)      1    -0.7    0      0.9    0.1   -0.6    0
        (missing)      0.7  -0.8    0.5    0.8    0.4   -0.2    0
        (missing)     -0.5   0.7    0.2   -1      0.9    0.8    0.5
  33. Evaluation data sets (slide footer: UMAP, June 2013, Rome, Italy). Six in-context rating data sets on diverse domains:
        Dataset    Ratings  Conditions  Context granularity
        Music      4013     26          1
        Tourism    1358     57          3
        Adom       1464     14          3
        Comoda     2296     49          12
        Movie      2190     29          2
        Library    609K     149         4
  34. Semantic vs. traditional pre-filtering. % = MAE reduction with respect to a context-free MF model (the higher, the better)
  35. SPF vs. State of the art. Methods compared: SPF (proposed method), UI-Splitting, CAMF. % = MAE reduction with respect to the context-free MF model (the higher, the better)
  36. Main contributions: Distributional Semantics; Novel content-based approach (SCB); Novel context-aware approach (SPF)
  37. Main contributions (II), Semantic Content-Based filtering (SCB): Method for computing the distributional semantics of item’s attributes. Two strategies for exploiting the semantic similarities during profile matching
  38. Main contributions (III), Semantic Content-Based filtering (SCB): Better accuracy than state of the art in new user scenarios. Chart panels: ranking prediction, rating prediction; SCB is the proposed method
  39. Main contributions (IV), Semantic Contextual Pre-filtering (SPF): Method for computing the distributional semantics of contextual conditions. Novel semantic pre-filtering method that reuses ratings in semantically similar contexts
  40. Main contributions (V), Semantic Contextual Pre-filtering (SPF): Better accuracy than state of the art
  41. Conclusions. Question 1? YES. It is possible to enhance content-based recommendation by exploiting the distributional semantics of item’s attributes. Question 2? YES. It is possible to enhance context-aware recommendation by exploiting the distributional semantics of contextual conditions
  42. Publications related to the thesis.
      Conference papers:
        CCIA 2010: Codina, V., & Ceccaroni, L. Taking advantage of semantics…
        DCAI 2010: Codina, V., & Ceccaroni, L. A Recommendation System for the…
        CCIA 2011: Codina, V., & Ceccaroni, L. Extending Recommendation Systems with…
        CCIA 2012: Codina, V., & Ceccaroni, L. Semantically-Enhanced Recommenders
        CARR 2013: Codina et al. Semantically-enhanced pre-filtering for…
        UMAP 2013: Codina et al. Exploiting the Semantic Similarity of Contextual…
        RecSys 2013: Codina et al. Local Context Modeling with Semantic Pre-filtering
      Journal paper:
        UMUAI (User Modeling and User-Adapted Interaction journal): Codina et al. Distributional Semantic Pre-filtering in Context-Aware Recommender Systems. 2012 Impact Factor: 1.600 (current status: accepted)
