Community Recommendation on LinkedIn
Observed preference
A user u joins a community y, recorded as a tuple (u, y).
The recommendation problem
Given a set of (u, y) tuples, predict for each user u a set R(u) of recommended communities.
A content-based approach
Owing to the rich profile data for users, we use a content-based model that computes similarity between users and groups.
An intuitive logistic model (pointwise)
f_u, f_y: features of user u and community y
w_i: parameters of the model
Communities that a user has joined are treated as relevant (positive examples).
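The model's formula is elided on the slide; a plausible reconstruction, assuming a linear logistic model over user–community similarity features (the feature map \phi is my assumption), is:

```latex
P(u \text{ joins } y) \;=\; \sigma\Big(\sum_i w_i\, \phi_i(f_u, f_y)\Big),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```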
Can pairwise learning help for
community recommendation?
● A reliable technique used in search engines. [Joachims 01]
● Has been proposed for some collaborative filtering models. [Rendle et al. 09, Pessiot et al. 07]
● Empirical evidence shows promising results. [Balakrishnan and Chopra 10]
Caveat
Learning time is quadratic in the number of communities.
How fast is the inference?
Outline
● Propose pairwise models for content-based
recommendation
● Augment pairwise learning with a latent
preference model
● Show both offline and online evaluation on
LinkedIn data for our proposed models
Expressing pairwise preference
We form a pair (yi, yj) whenever yi was ranked above yj but only yj was selected by the user, indicating that yj is preferred over yi.
We can define a ranking function h such that:
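The defining condition is elided on the slide; given the pair convention (yj preferred over yi), one consistent form is:

```latex
(y_i, y_j) \text{ observed} \;\Longrightarrow\; h(f_u, f_{y_j}) > h(f_u, f_{y_i})
```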
Building a pairwise logistic
recommender
Maximizing the likelihood of observed preference among
pairs:
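The objective itself is not shown; a standard pairwise-logistic likelihood consistent with the surrounding slides (my reconstruction) would be:

```latex
\max_{w} \;\prod_{(u,\, y_i,\, y_j)} P(y_j \succ y_i \mid u)
\;=\; \prod_{(u,\, y_i,\, y_j)} \sigma\big(h(f_u, f_{y_j}) - h(f_u, f_{y_i})\big)
```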
Model 1: Feature Difference Model
Assuming h to be a linear function,
Equivalent to logistic classification on the feature difference (f_yj - f_yi)
Ranking: can simply rank by computing the weighted sum of features for each community
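As a concrete sketch of Model 1 (illustrative NumPy code, not the authors' implementation; the data, learning rate, and function names are made up):

```python
# Minimal sketch of the feature-difference model: logistic classification
# on difference vectors (f_yj - f_yi), where each pair encodes
# "yj preferred over yi". Purely illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_feature_difference(pairs, lr=0.1, epochs=100):
    """Fit weights w by gradient ascent on the pairwise log-likelihood."""
    dim = pairs[0][0].shape[0]
    w = np.zeros(dim)
    for _ in range(epochs):
        for f_yi, f_yj in pairs:
            diff = f_yj - f_yi
            p = sigmoid(w @ diff)        # P(yj preferred over yi)
            w += lr * (1.0 - p) * diff   # log-likelihood gradient step
    return w

def rank(w, community_features):
    """Rank communities by their weighted feature sum -- no pairs needed."""
    scores = community_features @ w
    return np.argsort(-scores)
```

At inference time only the weighted sum w @ f_y is computed per community, so ranking stays linear in the number of candidates even though training iterated over pairs.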
Model 2: Logistic Loss Model
Assuming a more general ranking function:
Ranking: as long as we choose h to be a non-decreasing function, we can still rank by computing the weighted sum of features for each community.
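Written out, a sketch of Model 2 (my reconstruction, not the slide's exact formula): the preference probability depends on h applied to each community's weighted feature sum,

```latex
P(y_j \succ y_i \mid u) \;=\; \sigma\big(h(w^\top f_{y_j}) - h(w^\top f_{y_i})\big)
```

Since h is non-decreasing, sorting communities by w^T f_y gives the same order as sorting by h(w^T f_y).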
Pairwise learning improves the
classification of pairs
Task: For each pair, predict which community is
more preferred by a user
...but the gains are only slight.
Digging deeper: Joining statistics
for LinkedIn communities
Random sample, 1M users
FACT: Most users join different types of groups.
Possible hypothesis: There are different reasons for joining different types of groups.
Digging deeper: the effect of group
types
[Figure: User1 prefers the ML Group (interest feature) over the Cornell Alumni group (education feature), while User2 prefers the Cornell Alumni group over the ML Group.]
When a single weight is learned for each feature, the varying preferences of users may cancel each other out.
Different reasons for joining a
community can be treated as a set of
latent preferences within a user
[Diagram: a user's preference between a pair of communities is mediated by a latent core preference.]
Model 3: Pairwise PLSI model
Extend the Probabilistic Latent Semantic Indexing (PLSI) recommendation model to pairwise learning. [Hofmann 02]
We assume users are composed of a set of
latent preferences. Each user differs in how she
combines the available latent preferences.
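The slide omits the equation; a Hofmann-style decomposition applied to pairs would read (my reconstruction):

```latex
P(y_j \succ y_i \mid u) \;=\; \sum_{z} P(z \mid u)\; P(y_j \succ y_i \mid z)
```

where z ranges over the latent preferences and P(z | u) is the user-specific multinomial over them.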
Latent preferences over pairs help
retain differing user preferences
[Figure: Two latent preferences, z1 and z2, capture the opposing orderings: User1 prefers the ML Group (interest feature) over the Cornell Alumni group (education feature), while User2 prefers the reverse. User1 puts more weight on z1's preference; User2 puts more weight on z2's preference.]
Some details about the model
Number of core preferences (Z): small, e.g. {2, 4, 8}
Choosing probability models
Use the logistic loss or feature difference model for the conditional preference, and a multinomial model for the probability of a latent preference given a user.
Ranking
Thus, we can still rank communities individually
(without constructing pairs).
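The ranking step can be sketched in code. This is a minimal illustration assuming Z latent preferences with one weight vector each and an already-learned multinomial P(z | u); all names and numbers are illustrative, not from the talk.

```python
# Sketch of pairwise-PLSI ranking: score each community as a mixture over
# latent preferences. Because each mixture component is non-decreasing in
# the weighted feature sum, communities can be ranked individually,
# without constructing pairs. Purely illustrative.
import numpy as np

def plsi_scores(p_z_given_u, W, community_features):
    """score(u, y) = sum_z P(z|u) * (w_z . f_y)
    p_z_given_u:        (Z,)  multinomial over latent preferences for user u
    W:                  (Z, D) one weight vector per latent preference
    community_features: (N, D) feature matrix for candidate communities
    """
    per_z = community_features @ W.T   # (N, Z) per-preference scores
    return per_z @ p_z_given_u         # (N,)  mixture scores

# A user dominated by latent preference z0, which favors feature 0:
p_z = np.array([0.9, 0.1])
W = np.array([[2.0, 0.0],      # z0 weights feature 0
              [0.0, 2.0]])     # z1 weights feature 1
feats = np.array([[1.0, 0.0],  # community A: feature-0 heavy
                  [0.0, 1.0]]) # community B: feature-1 heavy
ranking = np.argsort(-plsi_scores(p_z, W, feats))
```

Here community A outranks B because the user's dominant latent preference z0 favors feature 0, even though B would win under z1 alone.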
Online evaluation
● Tested out Logistic Loss and Feature
Difference models on 5% of LinkedIn users,
and the baseline model on the rest
● Measured average click-through rate (CTR) over 2 weeks
● The feature-difference model showed a 5% increase in CTR; the logistic-loss model showed a 3% increase.
Conclusion: Pairwise learning can
be a useful addition.
However, gains may depend on the context /
domain.
It is important to understand and model the special characteristics of the target domain.
Thank you
Amit Sharma, @amt_shrma
www.cs.cornell.edu/~asharma