RANKING AND DIVERSITY
IN RECOMMENDATIONS
Alexandros Karatzoglou
@alexk_z
Thanks

Linas Baltrunas
Telefonica

Yue Shi
TU Delft

Saul Vargas
Universidad Autonoma de Madrid

Pablo Castells
Universidad Autonoma de Madrid
Telefonica Research in Barcelona
•  User Modeling: Recommender Systems
•  Data Mining, Machine Learning
•  Multimedia Indexing and Analysis
•  HCI
•  Mobile Computing
•  Systems and Networking
•  http://www.tid.es
Recommendations in Telefonica
People You May Know@Tuenti
Recommendations in Telefonica
Firefox OS Marketplace
Collaborative Filtering

+
+

+
+

+

+
+

+
X ↵ik
fij = hUi Mj i i
hUk Mj i
, Mj +
|Fi |
+

k2Fi
Tensor Factorization
CP-Decomposition
HOSVD

Fijkijk S ⇥U Ui ,⇥M j , CkC Ck
F = = hU i M Mj ⇥ i
Karatzoglou et al. @ Recsys 2010
Publications in Ranking
CIKM 2013: GAPfm: Optimal Top-N Recommendations for Graded Relevance Domains
RecSys 2013: xCLiMF: Optimizing Expected Reciprocal Rank for Data with
Multiple Levels of Relevance
RecSys 2012: CLiMF: Learning to Maximize Reciprocal Rank with Collaborative
Less-is-More Filtering * Best Paper Award
SIGIR 2012: TFMAP: Optimizing MAP for Top-N Context-aware Recommendation
Machine Learning Journal, 2008: Improving Maximum Margin Matrix Factorization
* Best Machine Learning Paper Award at ECML PKDD 2008
RecSys 2010: List-wise Learning to Rank with Matrix Factorization for Collaborative Filtering
NIPS 2007: CoFiRank - Maximum Margin Matrix Factorization for Collaborative Ranking
Recommendations are ranked lists
Popular Ranking Methods
•  In order to generate the ranked item list, we need some

relative utility score for each item
•  Popularity is the obvious baseline
•  Score could depend on the user (personalized)
•  Score could also depend on the other items in the list (list-wise)

•  One popular way to rank the items in RS is to sort the

items according to the rating prediction
•  Works for the domains with ratings
•  Wastes the modeling power for the irrelevant items
Graphical Notation

Relevant	
  
Irrelevant	
  
Irrelevant	
  
Irrelevant	
  
Irrelevant	
  
Irrelevant	
  
Relevant	
  
Relevant	
  
Ranking using latent
representation
•  If user = [-100, -100]
•  2d latent factor
•  We get the corresponding ranking
Matrix Factorization
(for ranking)
•  Randomly initialize item vectors
•  Randomly initialize user vectors
•  While not converged
•  Compute rating prediction error
•  Update user factors
•  Update item factors
•  Lets say user is [-100, -100]
•  Compute the square error
•  (5-<[-100, -100], [0.180, 0.19]>)2=1764

•  Update the user and item to the direction

where the error is reduced
(according to the gradient of the loss)

8 items with ratings
and random factors
Learning: Stochastic Gradient
Descent with Square Error Loss
SquareUser: [3, 1], RMSE=6.7
Loss
Learning	
  to	
  Rank	
  for	
  Top-­‐k	
  RecSys	
  
•  Usually	
  we	
  care	
  about	
  accurate	
  ranking	
  and	
  not	
  ra=ng	
  

predic=on	
  

•  Square	
  Error	
  loss	
  op=mizes	
  to	
  accurately	
  predict	
  1s	
  and	
  5s.	
  

•  RS	
  should	
  get	
  the	
  top	
  items	
  right	
  -­‐>	
  Ranking	
  problem	
  
•  Why	
  not	
  to	
  learn	
  how	
  to	
  rank	
  directly?	
  
•  Learning	
  to	
  Rank	
  methods	
  provide	
  up	
  to	
  30%	
  performance	
  
improvements	
  in	
  off-­‐line	
  evalua=ons	
  
•  It	
  is	
  possible,	
  but	
  a	
  more	
  complex	
  task	
  
Example: average precision (AP)
•  AP: we compute the precision at each relevant position

and average them

AP =

|S|
X

P (k)

k=1

|S|

P@1+ P@2 + P@4 1 /1+ 2 / 2 + 3 / 4
=
= 0.92
3
3
Why is hard? Non Smoothness
Example: AP

u:[-20,-20]

u:[20,20]
AP vs RMSE
The Non-smoothness of
Average Precision
AP =

|S|
X

P (k)

k=1

|S|

APm = PN

1

i=1

ymi

N
N
X ymi X
i=1

rmi

j=1

ymj I(rmj  rmi )

ymi

is 1 if item i is relevant for user m and 0 otherwise

I(·)

indicator function (1 if it is true, 0 otherwise)

rmi Rank of item i for user m
How can we get a smooth-AP?
•  We	
  replace	
  non	
  smooth	
  parts	
  of	
  MAP	
  with	
  smooth	
  

approxima=on	
  

1

rmi

⇡ g(fmi ) = g(hUm , Vi i)

g(x) = 1/(1 + e

x

)
How can we get a smooth-MAP?
•  We	
  replace	
  non	
  smooth	
  parts	
  of	
  MAP	
  with	
  smooth	
  

approxima=on	
  

I(rmj  rmi ) ⇡ g(fmj

fmi ) = g(hUm , Vj

Vi i)

g(x) = 1/(1 + e

x

)
Smooth version of MAP

u:[-20,-20]

u:[20,20]
Sometimes approximation is not very
good…
Ranking Inconsistencies
•  Achieving a perfect ranking for all users is not possible
•  Two Sources of Inconsistencies:
•  1) Factor Models (all models) have limited expressive

power and cannot learn the perfect ranking for all users
•  2) Ranking functions approximations are inconsistent e.g.

A >B & B>C but C > A
Summary on Ranking 101
Area Under the ROC Curve (AUC)

1
AU C := +
|S ||S |

S+ X
XS
i

j

I(Ri < Rj )
Reciprocal Rank (RR)

1
RR :=
Ri
Average Precision (AP)

AP =

|S|
X

P (k)

k=1

|S|
AP vs RR
Normalized Discounted Cumulative Gain
(nDCG)

DCG =

X 2score(i)
i

1
log2 (i + 2)
Relevance solved! Is that all?
•  Ranking “solves” the relevance problem
•  Can we be happy with the results?
Relevance solved! Is that all?
•  Coverage
•  Diversity
•  Novelty
•  Serendipity
Diversity in Recommendations
•  Diversity using Genres
•  Movies, Music, Books
•  Diversity should fulfill:
•  Coverage
•  Redundancy
•  Size Awareness
Diversity methods for RecSys
•  Topic List Diversification or
•  Maximal Margin Relevance

fM M R (i; S) = (1

) rel(i) +

min dist(i, j)
j2S
Diversity methods for RecSys
•  Intent-Aware IR metrics

ERR

IA =

X
s

p(s)ERR
Example

Action
Comedy
Sci-Fi
Western

Action
Thriller
Sci-Fi
Western

Adventure
Western
Genres

Action (517)
154

Comedy (1267)

125

8
11

6

60

37
4

146

2
639

187

21

9
0

30

8

3
9

4

4

232

171
11

1
75

Drama (1711)

86

10

202

870
Romance (583)

46

Thriller (600)
Genres and Popularity
Binomial Diversity
•  We base a new Diversity Metric on the Binomial
Distribution

✓ ◆
N k
P (X = k) =
p (1
k

p)

N

k
User Genre Relevance
•  Fraction of Items of genre “g” user interacted
with
Iu

p00
g

kg
=
|Iu |

•  Global fraction of items of genre “g”

•  Mix

P Iu
u kg
0
pg = P
u |Iu |

pg = (1

↵)

0
pg

+↵

00
pg
Coverage
•  Product of the probabilities of the genres not represented
in the list not being picked by random

Coverage(R) =

Y

g 2G(R)
/

P (Xg = 0)

1/|G|
Non-Redundancy
P (Xg

k | Xg > 0) = 1

N onRed(R) =

Y

g2G(R)

k 1
X
l=1

P (Xg

P (Xg = l | Xg > 0)
R
kg | Xg > 0)1/|G(R)|
Non-Redundancy
Binomial Diversity

BinomDiv(R) = Coverage(R) · N onRed(R)
Re-Ranking
•  Re-Rank based on Relevance and Binomial Diversity

fBinomDiv (i; S) = (1

) normrel (rel(i)) + normdiv (div(i; S))

normX (x) =

x

µX
X
Example
Thanks!
•  Questions?

Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, Berlin