In a modern recommender system, it is important to understand how products relate to each other. For example, while a user is looking for mobile phones, it might make sense to recommend other phones, but once they buy a phone, we might instead want to recommend batteries, cases, or chargers. These two types of recommendations are referred to as substitutes and complements: substitutes are products that can be purchased instead of each other, while complements are products that can be purchased in addition to each other. In this talk I will present methods for automatically identifying networks of substitute and complementary relationships between products, using text from their online reviews. We treat this as a supervised learning problem, trained using product networks derived from user behavior data. The product graph allows users to navigate, explore, and discover new and previously unknown products. It can also be used to identify interesting product combinations; for example, we can recommend outfits by matching a shirt with complementary trousers and a jacket. Finally, the graph can serve as a candidate-generation step in providing better and more context-relevant recommendations.
Bio:
Jure Leskovec is Chief Scientist at Pinterest and Assistant Professor of Computer Science at Stanford University. His work focuses on machine learning and data mining in large social and information networks. The problems he investigates are motivated by large-scale data, the Web, and online media. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his PhD in machine learning from Carnegie Mellon University, and postdoctoral training at Cornell University. Jure also co-founded a machine learning startup, Kosei, which was acquired by Pinterest. You can follow him on Twitter @jure.
11. Object Graph: Products
Pins & product catalogs:
10s of millions of products
100s of millions of product reviews
How do we build the product graph?
Three components:
Link Prediction
Topic models
Product hierarchies
13. Product Graph: Description
[Figure: example products with review-derived descriptions: “cleaner; quieter”, “cheaper; high power”, “well made, easy to install”, “fits perfectly, great value”]
15. Product Graph: What it does
1. Understand the notions of substitute and complement goods.
[Figure: example product pairs: one product “is substitutable for” another; a third product “complements” it]
16. Product Graph: What it does
2. Generate explanations of why certain products are preferred.
People prefer this product because: “Good quality, soft, light weight, the colors are beautiful and exactly like the picture!”
17. Product Graph: What it does
3. Recommend baskets of related items.
[Figure: query items with suggested complete outfits]
18. Product Graph: Overview
Building networks of products
Modeling: Can we use product data to model product relationships?
Understanding: Can we explain why people prefer certain products over others?
19. Problem Setting
Binary prediction task: given a pair of products, x and y, predict whether they are related (substitute/complementary).
Goal: build a probabilistic model that encodes the probability p((x, y) ∈ E) that the two products are linked.
20. Problem Setting
How do we learn from the data? Train by maximum likelihood.
[Figure: example product pairs labeled complementary (positive) vs. not complementary (negative)]
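As a minimal sketch of this setup (a toy Python illustration with hypothetical random data; logistic regression stands in for the parameterized link model, since fitting it maximizes exactly this binary log-likelihood):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy product features (hypothetical): 100 products, 20 features each.
rng = np.random.default_rng(0)
F = rng.random((100, 20))

# Candidate pairs and toy labels: 1 = complementary, 0 = not.
pairs = [(i, j) for i in range(100) for j in range(i + 1, 100)]
X = np.array([F[i] * F[j] for i, j in pairs])   # element-wise pair features
y = rng.integers(0, 2, len(pairs))

# Maximum-likelihood training of p((x, y) in E).
clf = LogisticRegression(max_iter=1000).fit(X, y)
p_edge = clf.predict_proba(X[:5])[:, 1]          # predicted link probabilities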
21. Attempt 1: Big bags of features
Features of product i (word counts over the vocabulary, from “aardvark” to “zoetrope”):
[0,0,0,0,0,0,0,1,0,5,0,0,0, … ,0,1,0,0,0,0,0,1,2]
Features of product j:
[0,0,0,1,0,0,0,0,0,0,0,1,0, … ,0,0,0,0,0,0,0,1,0]
Parameterized probability measure (essentially weighted nearest-neighbor; see the sketch after the list below).
• High-dimensional
• Prone to overfitting
• Too fine-grained
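A minimal sketch of such a measure, assuming one simple form: a logistic function of learned per-word weights on the two products' bag-of-words disagreement. The weight vector is as long as the vocabulary, which is exactly why this approach is high-dimensional and prone to overfitting:

import numpy as np

def p_related(f_x, f_y, w, b=0.0):
    # Weighted nearest-neighbor-style link probability:
    # sigmoid of a learned weighting of per-word feature differences.
    d = np.abs(f_x - f_y)
    return 1.0 / (1.0 + np.exp(-(w @ d + b)))

f_x = np.array([0, 0, 1, 0, 5, 0, 1, 2], dtype=float)   # toy word counts
f_y = np.array([0, 1, 0, 0, 0, 1, 1, 0], dtype=float)
w = np.zeros(8)                                          # one weight per vocabulary word
print(p_related(f_x, f_y, w))                            # 0.5 before any training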
24. Attempt 2: Features from Topics
Topic models: LDA (Blei & McAuliffe, 2007)
Use any kind of product-related features: brand, price, reviews, product descriptions, …
[Figure: example product topics, e.g. “Shoes”, “Female”, “Fashion”]
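As a sketch, assuming per-product review text and using scikit-learn's LatentDirichletAllocation as a stand-in topic model:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical corpus: one concatenated review document per product.
docs = [
    "great shoes fit perfectly very comfortable",
    "stylish warm jacket love the fashion look",
    "durable slim phone case protects the screen",
]
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=5, random_state=0)
theta = lda.fit_transform(counts)
# Each row of theta is a low-dimensional topic mixture for one product,
# e.g. something like [0.1, 0.4, 0.2, 0.1, 0.2].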
25. Attempt 2: Features from Topics
Features of product i (topic mixture): [0.1, 0.4, 0.2, 0.1, 0.2]
Features of product j: [0.3, 0.1, 0.3, 0.2, 0.1]
[Figure: topic dimensions labeled, e.g. “Shoes”, “Female”]
26. Attempt 2: Features from Topics
On the right track, but are the topics we are discovering relevant to link prediction?
27. Attempt 3: Learn “good” topics
Learn to discover topics that explain the graph structure.
28. Attempt 3: Learn “good” topics
[Figure: link prediction coupled with product “topics”]
Idea: learn both simultaneously.
Discover topics that “explain” product relations.
29. Attempt 3: Learn “good” topics
Conceptually, we want to learn to project products into topic space such that related products are nearby.
30. The SCEPTRE Model
Combining topic models with link prediction.
Topic model with topic distribution θ.
But the topics should be “good” as features for the link prediction.
31. The SCEPTRE Model: Details
[Figure: model details, including the topic-membership structure]
32. The SCEPTRE Model
Issue 1: The relationships we want to learn are not symmetric.
Why do people who view X eventually buy Y?
There is a link between the two products because people use similar words to describe them, but in what direction does the link flow?
33. The SCEPTRE Model
Why do people who view X eventually buy Y?
Solution: learn “directedness” in addition to “relatedness”.
Relatedness: explained by product “properties” (“baby, pajamas, pants, colorful”).
Directedness: captured by subjective/qualitative language (“true size, fits well, items are the same color as on the picture”).
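A minimal sketch of one way to decompose the score, assuming a symmetric term for relatedness plus an antisymmetric term for direction (an illustration, not the exact SCEPTRE parameterization):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def p_x_then_y(theta_x, theta_y, w_rel, w_dir, b=0.0):
    rel = w_rel @ (theta_x * theta_y)   # symmetric: unchanged if x and y swap
    dirn = w_dir @ (theta_x - theta_y)  # antisymmetric: flips sign when x and y swap
    return sigmoid(rel + dirn + b)

theta_x = np.array([0.1, 0.4, 0.2, 0.1, 0.2])
theta_y = np.array([0.3, 0.1, 0.3, 0.2, 0.1])
w_rel, w_dir = np.ones(5), np.ones(5)
print(p_x_then_y(theta_x, theta_y, w_rel, w_dir),
      p_x_then_y(theta_y, theta_x, w_rel, w_dir))  # differ only via the direction term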
34. Learning Multiple Graphs
Issue 2: We want to learn multiple relationships simultaneously, e.g. “browsed together” and “bought together”.
We could fit two independent models, but learning both at once:
1) gives us more data on which to train the complete model;
2) helps with interpretability, since both relationships are explained in terms of the same topics.
35. Learning Multiple Graphs
Solution: learn multiple regressors simultaneously, one per graph, all operating on a single shared set of topics.
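A sketch under those assumptions (hypothetical graph names; one logistic regressor per relationship, all reading the same shared topic vectors):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight vector per relationship graph; topics are shared across graphs.
weights = {
    "browsed_together": np.array([0.5, -0.2, 0.1, 0.0, 0.3]),
    "bought_together":  np.array([0.1, 0.4, -0.3, 0.2, 0.0]),
}

def p_link(theta_x, theta_y, graph):
    # Same shared topic interaction, scored by the graph-specific regressor.
    return sigmoid(weights[graph] @ (theta_x * theta_y))

theta_x = np.array([0.1, 0.4, 0.2, 0.1, 0.2])
theta_y = np.array([0.3, 0.1, 0.3, 0.2, 0.1])
print(p_link(theta_x, theta_y, "browsed_together"),
      p_link(theta_x, theta_y, "bought_together"))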
36. SCEPTRE Is Not Tractable
Issue 3: The model has too many parameters:
thousands of topics multiplied by millions of products.
37. Including Hierarchy
Idea: use the category hierarchy to sparsify the model.
Solution: product hierarchy.
38. Including Hierarchy
Associate each node in the category tree with a small number of topics.
Now we can fit models with thousands of topics, but only 10-20 are active per product.
“Car audio” topics, for example, have probability zero of being selected for a product outside that branch of the hierarchy.
Topics at the top of the hierarchy are common to all electronics products and will contain generic (though electronics-specific) language.
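A minimal sketch of this sparsification, with hypothetical category paths and topic IDs:

# Each node in the category tree owns a few topics (hypothetical assignment).
topics_per_node = {
    "Electronics": [0, 1],
    "Electronics/Audio": [2, 3],
    "Electronics/Audio/Car Audio": [4, 5],
}

def active_topics(category_path):
    # A product can only use topics along its own category path;
    # every other topic has probability zero for this product.
    parts = category_path.split("/")
    active = []
    for i in range(1, len(parts) + 1):
        active += topics_per_node.get("/".join(parts[:i]), [])
    return active

print(active_topics("Electronics/Audio"))  # [0, 1, 2, 3]; "Car Audio" topics excluded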
39. Training the model: EM
E-step (topic assignments)
M-step (link prediction)
Other topic/regression parameters: word distribution φ and topic assignments z
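Schematically, training alternates the two steps. Below is a toy, heavily hedged skeleton rather than the actual SCEPTRE updates: the E-step here is a crude stand-in for resampling topic assignments, and only the M-step's gradient ascent on the logistic link likelihood is faithful in spirit; all data is synthetic:

import numpy as np

rng = np.random.default_rng(0)
n_products, n_topics = 20, 5
theta = rng.dirichlet(np.ones(n_topics), n_products)   # per-product topic mixtures
w = np.zeros(n_topics)                                 # link-regressor weights
links = [(i, (i + 1) % n_products) for i in range(n_products)]  # toy positive edges

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(10):
    # E-step stand-in: pull linked products' topic mixtures together
    # (the real model resamples word-topic assignments z given the link likelihood).
    for i, j in links:
        avg = (theta[i] + theta[j]) / 2.0
        theta[i] = 0.9 * theta[i] + 0.1 * avg
        theta[j] = 0.9 * theta[j] + 0.1 * avg
    # M-step: gradient ascent on the logistic link likelihood, topics held fixed.
    for i, j in links:
        x = theta[i] * theta[j]
        w += 0.1 * (1.0 - sigmoid(w @ x)) * x

print(sigmoid(w @ (theta[0] * theta[1])))   # link probability after training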
40. Building the Product Graph
Now we can generate the product graph by identifying the most probable links: for every product, rank all other products according to p(x is related to y).
But this is slow: a quadratic number of comparisons!
Solution: use the product hierarchy and a matching engine.
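A minimal sketch of the candidate-restriction idea (hypothetical data structures; a simple category index stands in for a real matching engine):

from collections import defaultdict

products = [("p1", "Electronics/Audio"), ("p2", "Electronics/Audio"), ("p3", "Home/Kitchen")]

# Index products by category so each product is scored only against
# a small candidate set instead of all O(n^2) pairs.
by_category = defaultdict(list)
for pid, cat in products:
    by_category[cat].append(pid)

def top_links(pid, cat, p_related, k=10):
    candidates = [q for q in by_category[cat] if q != pid]
    return sorted(candidates, key=lambda q: p_related(pid, q), reverse=True)[:k]

print(top_links("p1", "Electronics/Audio", lambda a, b: 0.5))  # ['p2']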
41. Experiments
Just for fun, let’s use the Amazon product catalog.
43. Ranking Performance
Manual examination shows great performance (false positives are actually very relevant).
46. Explaining user preferences
Explain recommendations by identifying the words that “best explain” the link:
The topic model assigns a topic to each word.
The logistic regressor uses the words to make predictions.
Identify the phrases that maximize the likelihood of the link in order to explain it.
Use the “directedness” model to generate explanations, as it selects more subjective language (i.e., how the products differ, and why one product was “preferable” over another).
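A minimal sketch of the idea (hypothetical scorer; the actual model scores words through their topics and the link regressor):

def explain_link(words, p_link_given_words, top_k=3):
    # Rank words by how much each one raises the predicted link probability.
    base = p_link_given_words(words)
    gain = {w: base - p_link_given_words([x for x in words if x != w]) for w in set(words)}
    return sorted(gain, key=gain.get, reverse=True)[:top_k]

# Toy stand-in scorer: pretend subjective words drive the link.
subjective = {"soft", "beautiful", "light"}
toy_scorer = lambda ws: sum(w in subjective for w in ws) / (len(ws) + 1.0)
print(explain_link("good quality soft light weight beautiful colors".split(), toy_scorer))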