Additive Smoothing for Relevance-Based Language Modelling of Recommender Systems [CERI '16 Slides]

Slides of the presentation given at CERI 2016 for the following paper:

Daniel Valcarce, Javier Parapar, Álvaro Barreiro: Additive Smoothing for Relevance-Based Language Modelling of Recommender Systems. CERI 2016: Article 9.

http://dx.doi.org/10.1145/2934732.2934737

CERI 2016, GRANADA, SPAIN
ADDITIVE SMOOTHING FOR RELEVANCE-BASED LANGUAGE MODELLING OF RECOMMENDER SYSTEMS
Daniel Valcarce (@dvalcarce), Javier Parapar (@jparapar), Álvaro Barreiro (@AlvaroBarreiroG)
Information Retrieval Lab (@IRLab_UDC), University of A Coruña, Spain
Outline
1. Recommender Systems
2. Pseudo-Relevance Feedback
3. Relevance-Based Language Modelling of Recommender Systems
4. IDF Effect and Additive Smoothing
5. Experiments
6. Conclusions and Future Directions
RECOMMENDER SYSTEMS
Recommender Systems
Recommender systems generate personalised suggestions for items that may be of interest to the users.
◦ Top-N recommendation: create a ranking of the N most relevant items for each user.
◦ Collaborative filtering: exploit only user-item interactions (ratings, clicks, etc.).
PSEUDO-RELEVANCE FEEDBACK
Pseudo-Relevance Feedback (I)
In Information Retrieval, Pseudo-Relevance Feedback (PRF) is an automatic query expansion method.
The goal is to expand the original query with new terms to improve the quality of the search results.
These new terms are extracted automatically from a first retrieval using the original query.
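The PRF loop described above can be sketched in a few lines of Python. This is a toy illustration with a term-overlap scorer and made-up documents, not any particular retrieval system; the names `retrieve` and `prf_expand` are mine.

```python
from collections import Counter

def retrieve(docs, query, k):
    """Toy first-pass retrieval: rank documents by term overlap with the query."""
    return sorted(docs, key=lambda d: -len(set(d.split()) & set(query)))[:k]

def prf_expand(docs, query, k=2, n_terms=2):
    """Expand the query with the most frequent new terms
    from the top-k pseudo-relevant documents."""
    pseudo_relevant = retrieve(docs, query, k)
    counts = Counter(t for d in pseudo_relevant for t in d.split()
                     if t not in query)
    return query + [t for t, _ in counts.most_common(n_terms)]

docs = [
    "most populated state california census population",
    "state population figures california texas",
    "recipe for pasta carbonara",
]
print(prf_expand(docs, ["most", "populated", "state"]))
```

The expanded query (here the terms "california" and "population", frequent in the pseudo-relevant documents, are appended) is then submitted to the retrieval system a second time.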
Pseudo-Relevance Feedback (II)
(Diagram, built up across several slides: the user's information need is expressed as a query and submitted to the retrieval system; the top-ranked results feed a query-expansion step, which produces an expanded query that is submitted again.)
RELEVANCE-BASED LANGUAGE MODELLING OF RECOMMENDER SYSTEMS
Pseudo-Relevance Feedback for Collaborative Filtering

PRF                            CF
User's query                   User's profile
mostˆ1, populatedˆ2, stateˆ2   Titanicˆ2, Avatarˆ3, Matrixˆ5
Documents                      Neighbours
Terms                          Items
Relevance-Based Language Models (RM)
Relevance-Based Language Models, or Relevance Models (RM), are a state-of-the-art PRF technique (Lavrenko & Croft, SIGIR 2001).
◦ Two models: RM1 and RM2. RM1 works better than RM2 in retrieval.
Relevance Models have recently been adapted to collaborative filtering (Parapar et al., IPM 2013).
◦ For recommendation, RM2 is the preferred method.
Relevance Models for Collaborative Filtering
RM2: p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} p(i|v) \frac{p(v)}{p(i)} p(j|v)
◦ I_u is the set of items rated by the user u.
◦ V_u is the neighbourhood of the user u, computed using a clustering algorithm.
◦ p(i) and p(v) are the item and user priors.
◦ p(i|u) is computed by smoothing the maximum likelihood estimate with the probability in the collection.
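A minimal sketch of RM2 scoring under the formula above. It assumes uniform item and user priors (so the factor p(v)/p(i) is the same for every candidate and does not affect the ranking) and uses Jelinek-Mercer-smoothed p(i|v); the data layout and all names are illustrative, not the authors' implementation.

```python
import math

def rm2_scores(ratings, user, neighbours, lam=0.5):
    """Score unseen items for `user` with RM2:
    p(i|R_u) ∝ p(i) ∏_{j∈I_u} Σ_{v∈V_u} p(i|v) (p(v)/p(i)) p(j|v).
    Under uniform priors only the product of sums matters for ranking."""
    items = {i for r in ratings.values() for i in r}
    total = sum(sum(r.values()) for r in ratings.values())
    p_c = {i: sum(r.get(i, 0) for r in ratings.values()) / total for i in items}

    def p_cond(i, v):
        """Jelinek-Mercer-smoothed p(i|v) from neighbour v's ratings."""
        s = sum(ratings[v].values())
        return (1 - lam) * ratings[v].get(i, 0) / s + lam * p_c[i]

    rated = set(ratings[user])
    scores = {}
    for i in items - rated:          # recommend only unseen items
        scores[i] = sum(             # log-space product to avoid underflow
            math.log(sum(p_cond(i, v) * p_cond(j, v) for v in neighbours))
            for j in rated)
    return scores

ratings = {
    "u1": {"Titanic": 2, "Avatar": 3, "Matrix": 5},
    "u2": {"Titanic": 4, "Matrix": 4, "Alien": 5},
    "u3": {"Avatar": 2, "Matrix": 5, "Alien": 4, "Blade": 1},
}
print(rm2_scores(ratings, "u1", ["u2", "u3"]))  # Alien outranks Blade
```

With these made-up ratings, Alien (liked by both neighbours) scores above Blade (rated low by one neighbour), as expected.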
Collection-based Smoothing Techniques (I)
Absolute Discounting (AD):
p_\delta(i|u) = \frac{\max(r_{u,i} - \delta, 0) + \delta |I_u| p(i|C)}{\sum_{j \in I_u} r_{u,j}}
Jelinek-Mercer (JM):
p_\lambda(i|u) = (1 - \lambda) \frac{r_{u,i}}{\sum_{j \in I_u} r_{u,j}} + \lambda \, p(i|C)
Dirichlet Priors (DP):
p_\mu(i|u) = \frac{r_{u,i} + \mu \, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}}
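The three estimators can be written directly from the formulas above. In this sketch, `user_ratings` maps each item the user rated to its rating and `p_i_c` stands for p(i|C); all names are hypothetical.

```python
def smooth_ad(r_ui, user_ratings, p_i_c, delta=0.1):
    """Absolute Discounting: subtract delta from each seen rating,
    redistribute the mass delta*|I_u| according to p(i|C)."""
    total = sum(user_ratings.values())
    return (max(r_ui - delta, 0) + delta * len(user_ratings) * p_i_c) / total

def smooth_jm(r_ui, user_ratings, p_i_c, lam=0.5):
    """Jelinek-Mercer: linear interpolation of the MLE and the collection model."""
    total = sum(user_ratings.values())
    return (1 - lam) * r_ui / total + lam * p_i_c

def smooth_dp(r_ui, user_ratings, p_i_c, mu=100):
    """Dirichlet Priors: add mu pseudo-ratings distributed as p(i|C)."""
    total = sum(user_ratings.values())
    return (r_ui + mu * p_i_c) / (mu + total)
```

Each yields a proper probability distribution over the item catalogue (for AD, assuming every observed rating is at least delta), which is easy to check numerically.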
Collection-based Smoothing Techniques (II)
Absolute Discounting, Jelinek-Mercer and Dirichlet Priors have been studied in the context of:
Text Retrieval (Zhai & Lafferty, ACM TOIS 2004)
◦ Absolute Discounting performs very poorly.
◦ Dirichlet Priors is the most popular approach.
◦ Jelinek-Mercer is a bit better for long queries.
Collaborative Filtering (Valcarce et al., ECIR 2015)
◦ Absolute Discounting is the best smoothing method.
Can we do better?
IDF EFFECT AND ADDITIVE SMOOTHING
Axiomatic Analysis of the IDF Effect in IR
A recent work performed an axiomatic analysis of several PRF methods (Hazimeh & Zhai, ICTIR 2015).
They found that RM1 demotes the IDF effect under both the Dirichlet Priors and the Jelinek-Mercer smoothing methods.
The IDF effect is a desirable property that, intuitively, promotes documents with very specific terms.
Can we use this result in recommendation? What is the IDF effect in recommendation? Is it a desirable property? They studied RM1; what about RM2?
The IDF Effect in Recommendation (I)
This retrieval idea is related to novelty in recommendation.
Definition (IDF effect): a recommender system supports the IDF effect if p(i_1|R_u) > p(i_2|R_u) whenever two items i_1 and i_2 have the same ratings, r(v, i_1) = r(v, i_2) for all v \in V_u, but different popularity, p(i_1|C) < p(i_2|C).
In simple words: given the same feedback for two items, we should recommend the less popular one.
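A toy numeric illustration of why collection-based smoothing works against this property: with identical feedback (a rating of 4 out of a profile mass of 10), Jelinek-Mercer assigns a higher smoothed estimate to the more popular item, while additive smoothing, discussed next, ignores p(i|C) entirely. The popularity values here are made up.

```python
def jm(r_ui, total, p_i_c, lam=0.5):
    """Jelinek-Mercer-smoothed estimate: mixes in collection popularity."""
    return (1 - lam) * r_ui / total + lam * p_i_c

def additive(r_ui, total, n_items, gamma=0.1):
    """Additive smoothing: the same pseudo-count for every item; popularity unused."""
    return (r_ui + gamma) / (total + gamma * n_items)

p_c = {"niche": 0.01, "blockbuster": 0.30}   # collection popularity p(i|C)
total, n_items = 10, 100

print(jm(4, total, p_c["niche"]), jm(4, total, p_c["blockbuster"]))  # popular item wins
print(additive(4, total, n_items))  # identical for both items
```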
The IDF Effect in Recommendation (II)
We performed an axiomatic analysis of RM2 (mathematical proofs in the paper!) using the following smoothing methods:
◦ Dirichlet Priors
◦ Jelinek-Mercer
◦ Absolute Discounting
◦ Additive Smoothing:
p_\gamma(i|u) = \frac{r(u, i) + \gamma}{\sum_{j \in I_u} r(u, j) + \gamma |I|}
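The additive (Laplace) estimator above is a one-liner; `n_items` plays the role of the catalogue size |I|, and the names are mine. Unlike the collection-based methods, it still normalises to a proper distribution over the whole catalogue.

```python
def smooth_additive(r_ui, user_ratings, n_items, gamma=0.1):
    """Additive (Laplace) smoothing: add gamma pseudo-counts to every item,
    rated or not. p(i|C) never appears, so popularity cannot leak
    into the estimate."""
    total = sum(user_ratings.values())
    return (r_ui + gamma) / (total + gamma * n_items)

user = {"Titanic": 2, "Avatar": 3, "Matrix": 5}   # ratings of user u
catalogue = ["Titanic", "Avatar", "Matrix", "Alien"]
probs = [smooth_additive(user.get(i, 0), user, len(catalogue)) for i in catalogue]
print(probs)  # a proper distribution over the 4-item catalogue
```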
EXPERIMENTS
Experimental settings
Datasets:
◦ MovieLens 100k
◦ MovieLens 1M
Metrics:
◦ Ranking accuracy: nDCG.
◦ Diversity: the complement of the Gini index.
◦ Novelty: mean self-information (MSI).
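Sketches of the three metrics under common definitions: nDCG with a log2 discount, the Gini index over per-item recommendation counts, and MSI as the mean of −log2 p(i|C) over the recommended items. The paper's exact variants may differ slightly, so treat these as illustrations.

```python
import math

def ndcg_at_k(rels, k):
    """nDCG@k over a list of graded relevances in ranked order."""
    dcg = lambda xs: sum(r / math.log2(pos + 2) for pos, r in enumerate(xs[:k]))
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

def gini_complement(counts):
    """1 - Gini index of how often each catalogue item was recommended:
    1 means perfectly even exposure; values near 0 mean concentration."""
    xs, n = sorted(counts), len(counts)
    total = sum(xs)
    if total == 0:
        return 1.0
    return 1 - sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs)) / (n * total)

def msi_at_k(recommended, popularity, k):
    """Mean self-information: average -log2 p(i|C) over the top-k items."""
    top = recommended[:k]
    return sum(-math.log2(popularity[i]) for i in top) / len(top)
```

A perfect ranking gives nDCG@k = 1; perfectly even exposure gives a Gini complement of 1; recommending rarer items (smaller p(i|C)) raises MSI.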
Ranking accuracy
(Figure: values of nDCG@10 on MovieLens 100k (left) and 1M (right) for Additive (γ), Absolute Discounting (δ), Jelinek-Mercer (λ) and Dirichlet Priors (µ) as the smoothing parameter varies.)
Diversity
(Figure: values of Gini@10 on MovieLens 100k (left) and 1M (right) for the same four smoothing methods.)
Novelty
(Figure: values of MSI@10 on MovieLens 100k (left) and 1M (right) for the same four smoothing methods.)
G-measure of nDCG, Gini and MSI
(Figure: values of the geometric mean among nDCG@10, Gini@10 and MSI@10 on MovieLens 100k (left) and 1M (right) for the same four smoothing methods.)
CONCLUSIONS AND FUTURE DIRECTIONS
Conclusions
◦ The IDF effect from IR is related to the novelty of the recommendations.
◦ The use of collection-based smoothing methods with RM2 demotes the IDF effect.
◦ Additive smoothing is a simple method that neither demotes nor promotes the IDF effect.
◦ Additive smoothing provides better accuracy, diversity and novelty figures than collection-based smoothing methods.
Future work
Envision new ways of enhancing the IDF effect in RM2:
◦ Design smoothing methods that actively promote the IDF effect.
◦ Use non-uniform prior estimates.
Study axiomatically other IR properties that can be useful in recommendation.
THANK YOU!
@dvalcarce
http://www.dc.fi.udc.es/~dvalcarce
