4. Aside: “users” don’t have to be users
• At LinkedIn, the Recommender Systems team built a general-purpose entity-to-entity RecSys:
• Product [user, item]
• TalentMatch [job posting, user-profile]
• GroupsYouMayLike [user, group]
• {Jobs for your group} [group, job-posting]
• AdsYouMayBeInterestedIn [user, ad]
Anmol Bhasin, Monica Rogati (now VP of Data at Jawbone), and I built...
So how do recommender systems work?
next page: “an artist’s depiction of collaborative filtering”
6. CF is Generic!
• users / items reduced to GUIDs, could be anything
• large body of acad. work on techniques:
• SVD, ALS + other matrix factorizations (sketch below)
• stacked RBMs, etc.
• General purpose OSS CF recommender:
• Apache Mahout (http://mahout.apache.org)
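A minimal ALS sketch in numpy, purely to illustrate the bullet above: the ratings matrix, rank, and regularization value are all invented, and a production system would reach for something like Mahout instead:

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, 0 = unobserved.
# All values, the rank k, and the regularization lam are invented.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)
observed = R > 0
k, lam = 2, 0.1

rng = np.random.default_rng(0)
U = rng.normal(size=(R.shape[0], k))  # user latent factors
V = rng.normal(size=(R.shape[1], k))  # item latent factors

for _ in range(20):
    # Fix V; each user's factors are a small ridge-regression solve.
    for u in range(R.shape[0]):
        Vu = V[observed[u]]
        U[u] = np.linalg.solve(Vu.T @ Vu + lam * np.eye(k),
                               Vu.T @ R[u, observed[u]])
    # Fix U; symmetric solve for each item's factors.
    for i in range(R.shape[1]):
        Ui = U[observed[:, i]]
        V[i] = np.linalg.solve(Ui.T @ Ui + lam * np.eye(k),
                               Ui.T @ R[observed[:, i], i])

print(np.round(U @ V.T, 1))  # predictions for observed AND unobserved cells
```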
7. What about Domain-Specific Knowledge?
• Items are more than just GUIDs
• Users are more than just account names
• Perhaps, they are both much more:
• user profile text on Facebook / LinkedIn
• webdoc content + metadata
• movie genres, directors, description
• ad landing page content
could be derived data: at Twitter, every piece of text content
passing through the system gets classified into topical categories,
and users get classified according to their topical interests and
things they’re “known for”
8. User/Item Features
• each user has a feature-vector
• each item has a feature-vector
• dimensionalities may (will!) differ
• collectively, we thus have MOAR Matrices!
next page: more art!
9. Feature Matrices
we could decompose this resultant user/item-feature matrix...
slight misrepresentation: user-features along rows of first matrix, columns are user-ids
note: the “multiplication” here could be actual matrix mult (sketch below), OR maybe a more bayesian / statistical form: p(user|user-feature), p(item|user), p(item-feature|item) -> p(positive engagement | user-features, item-features). Full joint distribution -> HARD. Naive Bayes? or...
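To make the matrix-mult reading concrete, a tiny sketch (all shapes and values invented): chaining the user-feature matrix, the user/item preference matrix, and the item-feature matrix collapses the GUID dimensions out, leaving a user-feature x item-feature affinity matrix to decompose or learn from:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 4, 5
n_ufeat, n_ifeat = 3, 6  # dimensionalities may (will!) differ

# User-features along rows, user-ids along columns, as on the slide.
UF = rng.random((n_ufeat, n_users))                       # user-feature x user
P = (rng.random((n_users, n_items)) > 0.5).astype(float)  # user x item prefs
IF = rng.random((n_items, n_ifeat))                       # item x item-feature

# Chained multiplication: users and items drop out, leaving a
# user-feature x item-feature affinity matrix.
A = UF @ P @ IF
print(A.shape)  # (3, 6)
```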
10. Train a ranker/classifier
• take a column of the user-feature matrix
• take a row of the item-feature matrix
• embed both in a single combined feature vector
• train a classifier to predict ratings given that vector
go back and forth on this page to the previous one
next page is some notes about this
11. Training, cont.
• note: no need for any relationship between features
• if you apply a discretization technique, you don’t even need to care about correlation between +/- values and “goodness/badness” (sketch below)
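A tiny sketch of that discretization point (bucket edges invented): once a raw value becomes per-bucket indicators, the ranker learns an independent weight per bucket, so whether large values are “good” or “bad” stops mattering:

```python
import numpy as np

def discretize(x, edges):
    """One-hot bucket indicators for a raw feature value.

    The classifier gets an independent weight per bucket, so the
    sign/direction of the raw value no longer has to mean anything.
    """
    onehot = np.zeros(len(edges) + 1)
    onehot[np.searchsorted(edges, x)] = 1.0
    return onehot

edges = [-1.0, 0.0, 1.0]        # invented bucket boundaries
print(discretize(-2.5, edges))  # [1. 0. 0. 0.]
print(discretize(0.3, edges))   # [0. 0. 1. 0.]
```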
12. Classifier/Ranker RecSys HOWTO
• incoming preferences are triples of (user-feature vector, item-feature vector, preference-value)
• train classifier (online if desired!), and
• trained classifier spits out predicted rating given user/item pairs
• note: may require some item preselection
predicted ratings may not be what you want; it may be a Learn To Rank setup (sketch below)
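A minimal sketch of the HOWTO, assuming scikit-learn; the shapes and labels are invented, and SGDClassifier merely stands in for whatever (possibly online) learner you’d actually use:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
n, d_user, d_item = 1000, 8, 12  # invented sizes

# Incoming preference triples: (user-feature vec, item-feature vec, label).
user_vecs = rng.random((n, d_user))
item_vecs = rng.random((n, d_item))
prefs = rng.integers(0, 2, size=n)  # 1 = positive engagement (invented)

# Embed each pair as one concatenated feature vector.
X = np.hstack([user_vecs, item_vecs])

# partial_fit allows online training as preference triples stream in.
clf = SGDClassifier(loss="log_loss")
clf.partial_fit(X, prefs, classes=[0, 1])

# Predicted rating for an arbitrary user/item pair:
pair = np.hstack([user_vecs[0], item_vecs[3]]).reshape(1, -1)
print(clf.predict_proba(pair)[0, 1])
```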
14. Structured data
• Item = { field1, field2, field3, ... }
• User = { fieldA, fieldB, ... }
• field1: tf-idf-weighted “position description”
• field2 : standardized categorical job title
• field3 : #years experience
• fieldA : tf-idf “job requirements”
• fieldB : #years of experience required
note: this is TalentMatch: here “items” are LI profiles, and “users” are job postings
15. Pairwise-field similarity
• Some fields are naturally comparable to others: can compute vector cosine, Jaccard, etc.
• Others have a business-specific similarity: f(#years experience - required experience)
• Each set of field pairs generates an untrained weight (sketch below)
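A sketch of what those per-field-pair similarities might look like for one (job posting, profile) pair; the field contents and the business-specific experience rule below are entirely invented:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def jaccard(a, b):
    return len(a & b) / len(a | b)

def experience_sim(years, required):
    # Business-specific rule (invented): surplus is fine,
    # shortfall decays linearly toward zero.
    gap = years - required
    return 1.0 if gap >= 0 else max(0.0, 1.0 + gap / required)

# Invented field contents for one (job posting, profile) pair.
jobdesc_tfidf = np.array([0.3, 0.0, 0.9, 0.1])  # tf-idf "position description"
profile_tfidf = np.array([0.2, 0.1, 0.7, 0.0])  # tf-idf "job requirements"

sims = {
    "jobdesc~profile": cosine(jobdesc_tfidf, profile_tfidf),
    "title~title": jaccard({"data", "scientist"}, {"data", "engineer"}),
    "experience": experience_sim(years=3, required=5),
}
print(sims)  # each field pair's similarity gets its own untrained weight
```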
16. Train a low-dimensional classifier/ranker
• take these O(|item_fields| x |user_fields|) weights and feed them into the training of a ranker
• given the low number of features, very interpretable
interpretation: p(user is good for job) = w_(jobtitle,headline) * sim(jobtitle, headline) + w_(jobdesc,headline) * sim(jobdesc, headline) + w_(jobdesc,currentdesc) * sim(jobdesc, currentdesc) + ... (sketch below)
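And a sketch of the ranker itself, assuming scikit-learn: fit a logistic regression over the handful of pairwise similarities, then read the learned weights directly as the interpretation above (feature names and training data invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["sim(jobtitle,headline)",
                 "sim(jobdesc,headline)",
                 "sim(jobdesc,currentdesc)"]

# Each row: the pairwise-field similarities for one (job, candidate) pair.
# Labels are synthesized from invented "true" weights plus noise.
X = rng.random((500, len(feature_names)))
y = (X @ np.array([2.0, 0.5, 1.0]) + rng.normal(0, 0.3, 500) > 1.8).astype(int)

model = LogisticRegression().fit(X, y)
for name, w in zip(feature_names, model.coef_[0]):
    print(f"w_{name} = {w:+.2f}")  # weights read off as the interpretation
```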
17. Content-based RecSys: Pros
• Fixes cold-start problem
• Scales fantastically
• Flexible: can Learn To Rank using LR, SVM, GBDT, whatever
other content approach to cold-start: unsupervised similarity to engaged-with items/users + CF
scales: use as much data as your classifier/ranker needs to converge well. Once trained, it can often be a very low-latency way to generate item scores. Many classifiers are extremely insensitive to the number of input features.
18. Content-based RecSys: Cons
• Not always very general (although: pairwise crossed features are pretty general)
• Features may be too coarse
• Feature selection may be difficult
• Low latency from large item sets is hard
• Underweights popularity and similarity to known good items
for item preselection: clustering doesn’t always work well if you’re using crossed features, but tricks like LSH can sometimes help (sketch below)
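A minimal random-hyperplane LSH sketch (all sizes invented): bucket items offline by their sign pattern against random hyperplanes, then at request time only score items sharing the query’s bucket:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
dim, n_items, n_planes = 32, 10_000, 12  # invented sizes

items = rng.normal(size=(n_items, dim))
planes = rng.normal(size=(n_planes, dim))  # random hyperplanes

def signature(v):
    # Sign pattern against the hyperplanes; nearby vectors tend to collide.
    return ((planes @ v) > 0).tobytes()

buckets = defaultdict(list)
for idx, vec in enumerate(items):
    buckets[signature(vec)].append(idx)

# At request time, only items in the query's bucket get scored.
query = rng.normal(size=dim)
candidates = buckets[signature(query)]
print(f"{len(candidates)} candidates instead of {n_items}")
```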
19. Hybrid Models
• Classifiers/Regression models yield scores
• can combine this score with preferences from CF
• alternately, generate top-K items via CF, rerank with your content-based ranker, using CF rank as another feature (sketch below)
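A sketch of the rerank flavor (everything here is invented): CF supplies the top-K candidates, and a content-based scorer rescores them with the CF rank folded in as one more feature:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend CF already produced scored candidates; keep the top 20.
cf_topk = sorted(((item, rng.random()) for item in range(100)),
                 key=lambda p: -p[1])[:20]

def content_score(user_feats, item_feats, cf_rank):
    # Stand-in for a trained content ranker; weights are invented.
    # CF rank enters as just another feature.
    return 0.7 * float(user_feats @ item_feats) - 0.02 * cf_rank

user = rng.random(8)
item_feats = {item: rng.random(8) for item, _ in cf_topk}

reranked = sorted(((item, content_score(user, item_feats[item], rank))
                   for rank, (item, _) in enumerate(cf_topk)),
                  key=lambda p: -p[1])
print([item for item, _ in reranked[:5]])  # final top-5 after reranking
```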
20. Examples
• LinkedIn’s original (2010) generic entity-to-entity RecSys was primarily content-based
• Twitter’s #discover product is a hybrid recommender with content, social, and CF features
Note: PYMK is not primarily content based
Also: personalized search is naturally a hybrid content-based recommender
21. Conclusion
• Free paper title: “On the unreasonable effectiveness of CF on the consumer web”
• But if you do know features about users/items: learn to rank using them!
• This is more common in industry than you might think. But everyone’s got different domain-specific features, so there’s less research about it
CF works absurdly well, given how little it knows about the items it’s recommending
Riff on Wigner’s “The Unreasonable Effectiveness of Mathematics in the Natural Sciences”