
Exploring Statistical Language Models for Recommender Systems [RecSys '15 DS Slides]


Slides of the presentation given at the Doctoral Symposium of ACM RecSys 2015. The paper is entitled:

Daniel Valcarce: Exploring Statistical Language Models for Recommender Systems. RecSys 2015: 375-378

http://doi.acm.org/10.1145/2792838.2796547



  1. DOCTORAL SYMPOSIUM: Exploring Statistical Language Models for Recommender Systems. RecSys 2015, 16-20 September, Vienna, Austria. Daniel Valcarce (@dvalcarce), Information Retrieval Lab, University of A Coruña, Spain
  2. Motivation
  3-8. Information Retrieval vs Information Filtering (1)
      Information Retrieval (IR). Goal: retrieve relevant documents according to the information need of a user. Examples: search engines (web, multimedia...). Input: the user's query (explicit).
      Information Filtering (IF). Goal: select relevant items from an information stream for a given user. Examples: spam filters, recommender systems. Input: the user's history (implicit).
  9-12. Information Retrieval vs Information Filtering (2)
      Some people consider them different fields: U. Hanani, B. Shapira and P. Shoval: Information Filtering: Overview of Issues, Research and Systems. User Modeling and User-Adapted Interaction (2001)
      While others consider them the same thing: N. J. Belkin and W. B. Croft: Information Filtering and Information Retrieval: Two Sides of the Same Coin? Communications of the ACM (1992)
      What is undeniable is that they are closely related: why not apply techniques from one field to the other? It has already been done!
  13-17. Information Retrieval vs Information Filtering (3)
      Information Retrieval (IR). Some retrieval techniques are: Vector: Vector Space Model. MF: Latent Semantic Indexing (LSI). Probabilistic: LDA, Language Models (LM).
      Information Filtering (IF). Some CF techniques are: Vector: pairwise similarities (cosine, Pearson). MF: SVD, NMF. Probabilistic: LDA and other PGMs.
  18-23. Language Models for Recommendation: research goals
      Language Models (LM) represented a breakthrough in Information Retrieval: a state-of-the-art technique for text retrieval with a solid statistical foundation.
      Maybe they can also be useful in RecSys:
      Are LM a good framework for Collaborative Filtering?
      Can LM be adapted to deal with temporal (TARS) and/or contextual information (CARS)?
      Can a principled formulation of LM combine Content-Based and Collaborative Filtering?
  24-27. Language Models for Recommendation: related work
      There is little work on using Language Models for CF:
      J. Wang, A. P. de Vries and M. J. Reinders: A User-Item Relevance Model for Log-based Collaborative Filtering. ECIR 2006
      A. Bellogín, J. Wang and P. Castells: Bridging Memory-Based Collaborative Filtering and Text Retrieval. Information Retrieval (2013)
      J. Parapar, A. Bellogín, P. Castells and Á. Barreiro: Relevance-Based Language Modelling for Recommender Systems. Information Processing & Management (2013)
  28. Relevance-Based Language Models for Collaborative Filtering
  29. Relevance-Based Language Models
      Relevance-Based Language Models, or Relevance Models (RM), are a pseudo-relevance feedback technique from IR. Pseudo-relevance feedback is an automatic query expansion technique: the expanded query is expected to yield better results than the original one.
  30-37. Pseudo-relevance feedback
      [Diagram, built up over slides 30-37: a user's information need becomes a query; the retrieval system returns top-ranked documents; query expansion uses them to produce an expanded query, which is re-submitted to the retrieval system.]
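As a concrete illustration of that loop, here is a minimal sketch in Python, assuming a generic `search` function that returns ranked documents as lists of terms (all names and defaults here are illustrative, not from the paper):

```python
from collections import Counter

def pseudo_relevance_feedback(query, search, k=10, n_terms=5):
    """Expand `query` with frequent terms from the top-k retrieved documents."""
    top_docs = search(query)[:k]                      # initial retrieval
    counts = Counter(t for doc in top_docs for t in doc)
    new_terms = [t for t, _ in counts.most_common() if t not in query][:n_terms]
    return query + new_terms                          # re-issue the expanded query
```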
  38. Relevance-Based Language Models for CF Recommendation (1)
      The IR-to-RecSys mapping:
      User's query (e.g. most^1, populated^1, state^2) → User's profile (e.g. Titanic^2, Avatar^3, Shark^5)
      Documents → Neighbours
      Terms → Items
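In code, the analogy amounts to treating both objects as sparse weighted vectors; a hypothetical illustration:

```python
# An IR query: terms weighted by frequency ("most populated state")
query = {"most": 1, "populated": 1, "state": 2}

# Its CF analogue: a user profile, items weighted by their ratings
profile = {"Titanic": 2, "Avatar": 3, "Shark": 5}

# Both are sparse weighted vectors, so machinery that scores documents
# against `query` can score neighbours and items against `profile`.
```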
  39-42. Relevance-Based Language Models for CF Recommendation (2)
      Parapar et al. (2013), RM2:
      $p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} p(i|v) \frac{p(v)}{p(i)} p(j|v)$
      $I_u$ is the set of items rated by the user $u$.
      $V_u$ is the neighbourhood of the user $u$, computed using a clustering algorithm.
      $p(i|u)$ is computed by smoothing the maximum likelihood estimate with the probability in the collection.
      $p(i)$ and $p(v)$ are the item and user priors.
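A sketch of this scoring rule in Python, assuming the smoothed conditional `p_cond`, the priors and the neighbourhood $V_u$ are supplied externally (the paper leaves all of them as design choices; every name here is illustrative):

```python
import math

def rm2_score(i, user_items, neighbours, p_cond, p_item, p_user):
    """p(i|R_u) ∝ p(i) · Π_{j∈I_u} Σ_{v∈V_u} p(i|v) p(v)/p(i) p(j|v).

    Computed in log space to avoid floating-point underflow on long profiles.
    """
    log_score = math.log(p_item(i))
    for j in user_items:                              # j ranges over I_u
        inner = sum(p_cond(i, v) * p_user(v) / p_item(i) * p_cond(j, v)
                    for v in neighbours)              # v ranges over V_u
        if inner == 0:
            return float("-inf")                      # item unreachable from V_u
        log_score += math.log(inner)
    return log_score
```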
  43. Smoothing methods
  44. Smoothing in RM2
      RM2: $p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} p(i|v) \frac{p(v)}{p(i)} p(j|v)$
      $p(i|u)$ is computed by smoothing the maximum likelihood estimate
      $p_{ml}(i|u) = \frac{r_{u,i}}{\sum_{j \in I_u} r_{u,j}}$
      with the probability in the collection
      $p(i|C) = \frac{\sum_{v \in U} r_{v,i}}{\sum_{j \in I} \sum_{v \in U} r_{v,j}}$
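Both estimates are one-liners over the rating matrix; a sketch assuming ratings are stored as a nested dict `ratings[user][item]` (a layout chosen here for illustration):

```python
def p_ml(i, u, ratings):
    """Maximum likelihood estimate: p_ml(i|u) = r_{u,i} / Σ_{j∈I_u} r_{u,j}."""
    return ratings[u].get(i, 0) / sum(ratings[u].values())

def p_c(i, ratings):
    """Collection probability p(i|C): the item's share of all rating mass."""
    total = sum(sum(r.values()) for r in ratings.values())
    return sum(r.get(i, 0) for r in ratings.values()) / total
```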
  45-46. Why use smoothing?
      In Information Retrieval, smoothing provides: a way to deal with data sparsity; the inverse document frequency (IDF) role; document length normalisation.
      In RecSys, we have the same problems: data sparsity; item popularity vs item specificity; profiles with different lengths.
  47. Smoothing techniques
      Jelinek-Mercer (JM): linear interpolation, parameter $\lambda$.
      $p_\lambda(i|u) = (1 - \lambda)\, p_{ml}(i|u) + \lambda\, p(i|C)$
      Dirichlet priors (DP): Bayesian analysis, parameter $\mu$.
      $p_\mu(i|u) = \frac{r_{u,i} + \mu\, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}}$
      Absolute Discounting (AD): subtract a constant $\delta$.
      $p_\delta(i|u) = \frac{\max(r_{u,i} - \delta, 0) + \delta\, |I_u|\, p(i|C)}{\sum_{j \in I_u} r_{u,j}}$
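The three smoothers translate directly to code; this sketch reuses the `p_ml` and `p_c` estimates from the previous snippet (the default parameter values are placeholders, not the tuned values from the experiments):

```python
def p_jm(i, u, ratings, lam=0.5):
    """Jelinek-Mercer: interpolate the ML estimate with the collection model."""
    return (1 - lam) * p_ml(i, u, ratings) + lam * p_c(i, ratings)

def p_dp(i, u, ratings, mu=100.0):
    """Dirichlet priors: the collection model contributes mu pseudo-counts."""
    return (ratings[u].get(i, 0) + mu * p_c(i, ratings)) / (mu + sum(ratings[u].values()))

def p_ad(i, u, ratings, delta=0.1):
    """Absolute Discounting: shave delta off each rating, redistribute via p(i|C)."""
    discounted = max(ratings[u].get(i, 0) - delta, 0)
    return (discounted + delta * len(ratings[u]) * p_c(i, ratings)) / sum(ratings[u].values())
```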
  48. Experiments with smoothing
  49. Smoothing: ranking accuracy
      [Figure: nDCG@10 of RM2 + AD, RM2 + JM and RM2 + DP as the smoothing parameter ($\mu$ for DP; $\lambda$, $\delta$ for JM, AD) varies, using 400 nearest neighbours according to Pearson's correlation on the MovieLens 100k dataset.]
  50. Smoothing: diversity
      [Figure: Gini@10 of RM2 under the same three smoothing methods and setup.]
  51. Smoothing: novelty
      [Figure: MSI@10 of RM2 under the same three smoothing methods and setup.]
  52. More about smoothing in RM2 for CF
      D. Valcarce, J. Parapar and Á. Barreiro: A Study of Smoothing Methods for Relevance-Based Language Modelling of Recommender Systems. ECIR 2015
  53. Priors
  54-55. Priors in RM2
      RM2: $p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} p(i|v) \frac{p(v)}{p(i)} p(j|v)$
      $p(i)$ and $p(v)$ are the item and user priors. They enable introducing a priori information into the model and provide a principled way of modelling business rules!
  56. Prior estimates
      User prior: uniform $p_U(u) = \frac{1}{|U|}$; linear $p_L(u) = \frac{\sum_{i \in I_u} r_{u,i}}{\sum_{v \in U} \sum_{j \in I_v} r_{v,j}}$
      Item prior: uniform $p_U(i) = \frac{1}{|I|}$; linear $p_L(i) = \frac{\sum_{u \in U_i} r_{u,i}}{\sum_{j \in I} \sum_{v \in U_j} r_{v,j}}$
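The four estimates over the same nested-dict rating matrix assumed earlier (again a sketch, not the authors' implementation; `items` is an assumed collection of all item ids):

```python
def p_user_uniform(u, ratings):
    """p_U(u) = 1/|U|: every user equally likely a priori."""
    return 1.0 / len(ratings)

def p_user_linear(u, ratings):
    """p_L(u): proportional to the user's total rating mass."""
    total = sum(sum(r.values()) for r in ratings.values())
    return sum(ratings[u].values()) / total

def p_item_uniform(i, items):
    """p_U(i) = 1/|I|: every item equally likely a priori."""
    return 1.0 / len(items)

def p_item_linear(i, ratings):
    """p_L(i): proportional to the item's total rating mass (popularity)."""
    total = sum(sum(r.values()) for r in ratings.values())
    return sum(r.get(i, 0) for r in ratings.values()) / total
```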
  57. Experiments with priors
  58. Priors on MovieLens 100k
      User prior | Item prior | nDCG@10 | Gini@10 | MSI@10
      Linear     | Linear     | 0.0922  | 0.4603  | 28.4284
      Uniform    | Linear     | 0.2453  | 0.2027  | 16.4022
      Uniform    | Uniform    | 0.3296  | 0.0256  | 6.8273
      Linear     | Uniform    | 0.3423  | 0.0264  | 6.7848
      Table: nDCG@10, Gini@10 and MSI@10 values of RM2 varying the prior estimates, using 400 nearest neighbours according to Pearson's correlation on the MovieLens 100k dataset, with Absolute Discounting ($\delta = 0.1$).
      More priors in D. Valcarce, J. Parapar and Á. Barreiro: A Study of Priors for Relevance-Based Language Modelling of Recommender Systems. RecSys 2015!
  59. Comparison with other CF algorithms
  60. Comparison on MovieLens 100k
      Algorithm   | nDCG@10 | Gini@10 | MSI@10
      SVD         | 0.0946  | 0.0109  | 14.6129
      SVD++       | 0.1113  | 0.0126  | 14.9574
      NNCosNgbr   | 0.1771  | 0.0344  | 16.8222
      UIR-Item    | 0.2188  | 0.0124  | 5.2337
      PureSVD     | 0.3595  | 0.1364  | 11.8841
      RM2-JM      | 0.3175  | 0.0232  | 9.1087
      RM2-DP      | 0.3274  | 0.0251  | 9.2181
      RM2-AD      | 0.3296  | 0.0256  | 9.2409
      RM2-AD-L-U  | 0.3423  | 0.0264  | 9.2004
      Table: nDCG@10, Gini@10 and MSI@10 values of different CF recommendation algorithms.
  61. Conclusions and future directions
  62-65. Conclusions
      IR techniques can be employed in RecSys: not only methods such as SVD but also Language Models!
      Language Models provide a principled and interpretable framework for recommendation.
      Relevance-Based Language Models are competitive, but there is room for improvement:
      ◦ More sophisticated priors
      ◦ Neighbourhood computation (see the k-NN sketch below): different similarity metrics (cosine, Kullback-Leibler divergence); matrix factorisation (NMF, SVD); spectral clustering (NC)
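For the neighbourhood line above, a minimal cosine-similarity k-NN sketch, using one of the listed metrics (the clustering and factorisation alternatives would replace this step; k = 400 mirrors the experimental setting):

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (item -> rating dicts)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn(u, ratings, k=400):
    """The k most similar users to u, usable as V_u in the RM2 formula."""
    sims = {v: cosine(ratings[u], ratings[v]) for v in ratings if v != u}
    return sorted(sims, key=sims.get, reverse=True)[:k]
```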
  66-67. Future work
      Improve novelty and diversity figures: RM2 performance is similar to PureSVD in terms of nDCG, but it falls short in terms of diversity and novelty.
      Introduce more evidence into the LM framework beyond ratings: content-based information (hybrid recommenders); temporal and contextual information (TARS & CARS).
  68. Thank you! @dvalcarce http://www.dc.fi.udc.es/~dvalcarce
  69. Time and Context in Language Models
      Time:
      X. Li and W. B. Croft: Time-based Language Models. CIKM 2003
      K. Berberich, S. Bedathur, O. Alonso and G. Weikum: A Language Modeling Approach for Temporal Information Needs. ECIR 2010
      Context:
      H. Rode and D. Hiemstra: Conceptual Language Models for Context-Aware Text Retrieval. TREC 2004
      L. Azzopardi: Incorporating Context within the Language Modeling Approach for Ad Hoc Information Retrieval. PhD Thesis (2005)
